In this contribution, an algorithm for evaluating the capacity-achieving input covariance matrices for
frequency selective Rayleigh MIMO channels is proposed. In contrast with the flat fading Rayleigh case,
no closed-form expressions for the eigenvectors of the optimum input covariance matrix are available.
Classically, both the eigenvectors and eigenvalues are computed numerically and the corresponding
optimization algorithms remain computationally very demanding.

In this paper, it is proposed to optimize (w.r.t. the input covariance matrix) a large system
approximation of the average mutual information derived by Moustakas and Simon. The validity of this
asymptotic approximation is clarified thanks to Gaussian large random matrices methods. It is shown
that the approximation is a strictly concave function of the input covariance matrix and that the average
mutual information evaluated at the argmax of the approximation is equal to the capacity of the channel
up to a $\mathcal{O}(1/t)$ term, where $t$ is the number of transmit antennas. An algorithm based on an iterative
waterfilling scheme is proposed to maximize the average mutual information approximation, and its
convergence is studied. Numerical simulation results show that, even for a moderate number of transmit
and receive antennas, the new approach provides the same results as direct maximization approaches of
the average mutual information.

Index Terms 

Ergodic capacity, large random matrices, frequency selective MIMO channels 

I. Introduction 

When the channel state information is available at both the receiver and the transmitter of a MIMO 
system, the problem of designing the transmitter in order to maximize the (Gaussian) mutual information 
of the system has been addressed successfully in a number of papers. This problem is however more
difficult when the transmitter only has knowledge of the statistical properties of the channel, the channel
state information being still available at the receiver side, a more realistic assumption in the context
of mobile systems. In this case, the mutual information is replaced by the average (ergodic) mutual
information (EMI), which, of course, is more complicated to optimize.

The optimization problem of the EMI has been addressed extensively in the case of certain flat fading 
Rayleigh channels. In the context of the so-called Kronecker model, it has been shown by various authors 
(see e.g. [1] for a review) that the eigenvectors of the optimal input covariance matrix must coincide with 
the eigenvectors of the transmit correlation matrix. It is therefore sufficient to evaluate the eigenvalues of 
the optimal matrix, a problem which can be solved by using standard optimization algorithms. Similar 
results have been obtained for flat fading uncorrelated Rician channels ([2]).

In this paper, we consider this EMI maximization problem in the case of popular frequency selective
MIMO channels (see e.g. [3], [4]) with independent paths. In this context, the eigenvectors of the
optimum transmit covariance matrix have no closed-form expressions, so that both the eigenvalues and the
eigenvectors of the matrix have to be evaluated numerically. For this, it is possible to adapt the approach
of [5], developed in the context of correlated Rician channels. However, the corresponding algorithms are
computationally very demanding as they heavily rely on intensive Monte-Carlo simulations. We therefore
propose to optimize the approximation of the EMI derived by Moustakas and Simon ([4]), which is in principle
valid when the numbers of transmit and receive antennas converge to infinity at the same rate, but is accurate
for realistic numbers of antennas. This will turn out to be a simpler problem. We mention that, while [4]
contains some results related to the structure of the argument of the maximum of the EMI approximation,
it does not propose any optimization algorithm.

We first review the results of [4] related to the large system approximation of the EMI. The analysis
of [4] is based on the so-called replica method, an ingenious trick whose mathematical relevance has
not yet been established. Using a generalization of the rigorous analysis of [6], we
verify the validity of the approximation of [4] and provide the convergence speed under certain technical
assumptions. Besides, the expression of the approximation depends on the solutions of a nonlinear
system. The existence and the uniqueness of the solutions are not addressed in [4]. As our optimization
algorithm needs to solve this system, we clarify this crucial point. We show in particular that the system
admits a unique solution that can be evaluated numerically using a fixed point algorithm. Next, we
study the properties of the EMI approximation, and briefly justify that it is a strictly concave function of
the input covariance matrix. We show that the mutual information corresponding to the argmax of
the EMI approximation is equal to the channel capacity up to a $\mathcal{O}(1/t)$ term, where $t$ is the number of
transmit antennas. Therefore it is relevant to optimize the EMI approximation to evaluate the capacity
achieving covariance matrix. We finally present our maximization algorithm of the EMI approximation. It
is based on an iterative waterfilling algorithm which, in some sense, can be seen as a generalization of [7],
devoted to the Rayleigh context, and of [8], [9], devoted to the correlated Rician case: each iteration
consists in solving the above mentioned system of nonlinear equations as well as a standard waterfilling
problem. It is proved that the algorithm converges towards the optimum input covariance matrix as long
as it converges¹.

The paper is organized as follows. Section II is devoted to the presentation of the channel model, the
underlying assumptions and the problem statement. In Section III we rigorously derive the large system
approximation of the EMI with Gaussian methods and establish some properties of the asymptotic
approximation as a function of the covariance matrix of the input signal. The maximization problem
of the EMI approximation is then studied in Section IV. Numerical results are provided in Section V.

II. Problem statement 
A. General Notations 

In this paper, the notations $s$, $\mathbf{x}$, $\mathbf{M}$ stand for scalars, vectors and matrices, respectively. As usual,
$\|\mathbf{x}\|$ represents the Euclidean norm of vector $\mathbf{x}$, and $\|\mathbf{M}\|$, $\rho(\mathbf{M})$ and $|\mathbf{M}|$ respectively stand for the
spectral norm, the spectral radius and the determinant of matrix $\mathbf{M}$. The superscripts $(\cdot)^T$ and $(\cdot)^H$
represent respectively the transpose and the conjugate transpose. The trace of $\mathbf{M}$ is denoted by $\mathrm{Tr}(\mathbf{M})$. The
mathematical expectation operator is denoted by $\mathbb{E}(\cdot)$. We denote by $\delta_{i,j}$ the Kronecker delta, i.e. $\delta_{i,j} = 1$
if $i = j$ and $0$ otherwise.

Throughout this paper, $r$ and $t$ stand for the number of receive and transmit antennas. Certain quantities
will be studied in the asymptotic regime $t \to \infty$, $r \to \infty$ in such a way that $t/r \to c \in (0, \infty)$.
In order to simplify the notations, $t \to \infty$ should be understood from now on as $t \to \infty$, $r \to \infty$
and $t/r \to c \in (0, \infty)$. A matrix $\mathbf{M}_t$ whose size depends on $t$ is said to be uniformly bounded if
$\sup_t \|\mathbf{M}_t\| < \infty$.

Several variables used throughout this paper depend on various parameters, e.g. the number of antennas, 
the noise level, the covariance matrix of the transmitter, etc. In order to simplify the notations, we may 
not always mention all these dependencies. 

¹Note however that we have been unable to prove its convergence formally.
April 13, 2011 DRAFT 



B. Channel model 

We consider a wireless MIMO link with $t$ transmit and $r$ receive antennas corrupted by a multipath
propagation channel. The discrete-time propagation channel between the transmitter and the receiver is
characterized by the input-output equation

$$ \mathbf{y}(n) = \sum_{l=1}^{L} \mathbf{H}^{(l)} \mathbf{s}(n-l+1) + \mathbf{n}(n) = [\mathbf{H}(z)]\mathbf{s}(n) + \mathbf{n}(n), \tag{1} $$

where $\mathbf{s}(n) = (s_1(n), \ldots, s_t(n))^T$ and $\mathbf{y}(n) = (y_1(n), \ldots, y_r(n))^T$ represent the transmit and the receive
vector at time $n$ respectively. $\mathbf{n}(n)$ is an additive Gaussian noise such that $\mathbb{E}(\mathbf{n}(n)\mathbf{n}(n)^H) = \sigma^2 \mathbf{I}$. $\mathbf{H}(z)$
denotes the transfer function of the discrete-time equivalent channel defined by

$$ \mathbf{H}(z) = \sum_{l=1}^{L} \mathbf{H}^{(l)} z^{-(l-1)}. \tag{2} $$

Each coefficient $\mathbf{H}^{(l)}$ is assumed to be a Gaussian random matrix given by

$$ \mathbf{H}^{(l)} = \frac{1}{\sqrt{t}} (\mathbf{C}^{(l)})^{1/2} \mathbf{W}_l (\tilde{\mathbf{C}}^{(l)})^{1/2}, \tag{3} $$

where $\mathbf{W}_l$ is an $r \times t$ random matrix whose entries are independent and identically distributed complex
circular Gaussian random variables, with zero mean and unit variance. The matrices $\mathbf{C}^{(l)}$ and $\tilde{\mathbf{C}}^{(l)}$ are
positive definite, and respectively account for the receive and transmit antenna correlation. This correlation
structure is called a separable or Kronecker correlation model. We also assume that for each $k \neq l$,
matrices $\mathbf{H}^{(k)}$ and $\mathbf{H}^{(l)}$ are independent. Note that our assumptions imply that $\mathbf{H}^{(l)} \neq 0$ for $l = 1, \ldots, L$.
However, it can be checked easily that the results stated in this paper remain valid if some coefficients
$(\mathbf{H}^{(l)})_{l=1,\ldots,L}$ are zero.

In the context of this paper, the channel matrices are assumed perfectly known at the receiver side.
However, only the statistics of the $(\mathbf{H}^{(l)})_{l=1,\ldots,L}$, i.e. matrices $(\mathbf{C}^{(l)}, \tilde{\mathbf{C}}^{(l)})_{l=1,\ldots,L}$, are available at the
transmitter side.
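As an illustration of the channel model (3), the following sketch draws one realization of the path matrices from given correlation matrices. It is only a minimal example under stated assumptions: the helper names `psd_sqrt` and `sample_paths`, the antenna numbers and the identity correlation matrices in the example are hypothetical choices made here, not taken from the paper.

```python
import numpy as np


def psd_sqrt(m):
    """Hermitian square root of a positive semidefinite matrix (via eigh)."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def sample_paths(c_rx, c_tx, rng):
    """Draw one realization of (H^(1), ..., H^(L)) following equation (3).

    c_rx: list of L receive correlation matrices C^(l), each r x r.
    c_tx: list of L transmit correlation matrices C~^(l), each t x t.
    """
    r, t = c_rx[0].shape[0], c_tx[0].shape[0]
    paths = []
    for c, c_tilde in zip(c_rx, c_tx):
        # W_l: i.i.d. circular complex Gaussian entries, zero mean, unit variance.
        w = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
        paths.append(psd_sqrt(c) @ w @ psd_sqrt(c_tilde) / np.sqrt(t))
    return paths


# Hypothetical example: L = 2 paths, r = 2, t = 4, no antenna correlation.
rng = np.random.default_rng(0)
paths = sample_paths([np.eye(2)] * 2, [np.eye(4)] * 2, rng)
h_at_one = sum(paths)  # transfer function H(z) evaluated at z = 1
```

With identity correlation matrices each entry of a path matrix has variance $1/t$, which is a quick sanity check on the normalization in (3).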

C. Ergodic capacity of the channel. 

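Let $\mathbf{Q}(e^{2i\pi\nu})$ be the $t \times t$ spectral density matrix of the transmit signal $\mathbf{s}(n)$, which is assumed to
verify the transmit power condition

$$ \frac{1}{t} \int_0^1 \mathrm{Tr}\left(\mathbf{Q}(e^{2i\pi\nu})\right) d\nu = 1. \tag{4} $$

Then, the (Gaussian) ergodic mutual information $I(\mathbf{Q}(.))$ between the transmitter and the receiver is
defined as

$$ I(\mathbf{Q}(.)) = \mathbb{E}_{\mathbf{W}} \left[ \int_0^1 \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H}(e^{2i\pi\nu}) \mathbf{Q}(e^{2i\pi\nu}) \mathbf{H}(e^{2i\pi\nu})^H \right| d\nu \right], \tag{5} $$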
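where $\mathbb{E}_{\mathbf{W}}[\cdot] = \mathbb{E}_{(\mathbf{W}_l)_{l=1,\ldots,L}}[\cdot]$. The ergodic capacity of the MIMO channel is equal to the maximum
of $I(\mathbf{Q}(.))$ over the set of all spectral density matrices satisfying the constraint (4). The hypotheses
formulated on the statistics of the channel however allow the optimization to be limited to the set of positive
matrices which are independent of the frequency $\nu$. This is because the probability distribution of matrix
$\mathbf{H}(e^{2i\pi\nu})$ is clearly independent of the frequency $\nu$. More precisely, the mutual information $I(\mathbf{Q}(.))$ is
also given by

$$ I(\mathbf{Q}(.)) = \mathbb{E}_{\mathbf{H}} \left[ \int_0^1 \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H}(1) \mathbf{Q}(e^{2i\pi\nu}) \mathbf{H}(1)^H \right| d\nu \right], $$

where $\mathbf{H} = \sum_{l=1}^{L} \mathbf{H}^{(l)} = \mathbf{H}(1)$. Using the concavity of the logarithm, we obtain that

$$ I(\mathbf{Q}(.)) \le \mathbb{E}_{\mathbf{H}} \left[ \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H}(1) \left( \int_0^1 \mathbf{Q}(e^{2i\pi\nu}) d\nu \right) \mathbf{H}(1)^H \right| \right]. $$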



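We denote by $\mathcal{C}$ the cone of nonnegative Hermitian matrices, and by $\mathcal{C}_1$ the subset of all matrices $\mathbf{Q}$ of
$\mathcal{C}$ satisfying $\frac{1}{t}\mathrm{Tr}(\mathbf{Q}) = 1$. If $\mathbf{Q}$ is an element of $\mathcal{C}_1$, the mutual information $I(\mathbf{Q})$ reduces to

$$ I(\mathbf{Q}) = \mathbb{E}_{\mathbf{H}} \left[ \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H} \mathbf{Q} \mathbf{H}^H \right| \right]. \tag{6} $$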



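$\mathbf{Q} \mapsto I(\mathbf{Q})$ is strictly concave on the convex set $\mathcal{C}_1$ and reaches its maximum at a unique element
$\mathbf{Q}_* \in \mathcal{C}_1$. It is clear that if $\mathbf{Q}(e^{2i\pi\nu})$ is any spectral density satisfying (4), then the matrix $\int_0^1 \mathbf{Q}(e^{2i\pi\nu}) d\nu$
is an element of $\mathcal{C}_1$. Therefore,

$$ \mathbb{E}_{\mathbf{H}} \left[ \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H} \left( \int_0^1 \mathbf{Q}(e^{2i\pi\nu}) d\nu \right) \mathbf{H}^H \right| \right] \le \mathbb{E}_{\mathbf{H}} \left[ \log \left| \mathbf{I}_r + \frac{1}{\sigma^2} \mathbf{H} \mathbf{Q}_* \mathbf{H}^H \right| \right]. $$

In other words,

$$ I(\mathbf{Q}(.)) \le I(\mathbf{Q}_*) $$

for each spectral density matrix verifying (4). This shows that the maximum of function $I$ over the set
of all spectral densities satisfying (4) is reached on the set $\mathcal{C}_1$. The ergodic capacity $C_E$ of the channel
is thus equal to

$$ C_E = \max_{\mathbf{Q} \in \mathcal{C}_1} I(\mathbf{Q}). \tag{7} $$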

We note that property (7) also holds if the time delays of the channel are non-integer multiples of the
symbol period, provided that the receiving filter coincides with the ideal low-pass filter on the $[-\frac{1}{2T}, \frac{1}{2T}]$
frequency interval, where $T$ denotes the symbol period. If this is the case, the transfer function $\mathbf{H}(e^{2i\pi\nu})$
is equal to $\mathbf{H}(e^{2i\pi\nu}) = \sum_{l=1}^{L} \mathbf{H}^{(l)} e^{-2i\pi\nu\tau_l}$, where $\tau_l$ is the delay associated to path $l$ for $l = 1, \ldots, L$.
The probability distribution of $\mathbf{H}(e^{2i\pi\nu})$ does not depend on $\nu$ and this leads immediately to (7). If the
matrices $(\mathbf{C}^{(l)})_{l=1,\ldots,L}$ all coincide with a matrix $\mathbf{C}$, matrix $\mathbf{H}$ follows a Kronecker model with transmit
and receive covariance matrices $\frac{1}{L}\sum_{l=1}^{L} \tilde{\mathbf{C}}^{(l)}$ and $\mathbf{C}$ respectively [10]. In this case, the eigenvectors of
the optimum matrix $\mathbf{Q}_*$ coincide with the eigenvectors of $\frac{1}{L}\sum_{l=1}^{L} \tilde{\mathbf{C}}^{(l)}$. The situation is similar if the
transmit covariance matrices $(\tilde{\mathbf{C}}^{(l)})_{l=1,\ldots,L}$ coincide. In the most general case, the eigenvectors of $\mathbf{Q}_*$
have however no closed-form expression. The evaluation of $\mathbf{Q}_*$ and of the channel capacity $C_E$ is thus
a more difficult problem. A possible solution consists in adapting the Vu-Paulraj approach ([5]) to the
present context. However, the algorithm presented in [5] is very demanding since the evaluations of the
gradient and the Hessian of $I(\mathbf{Q})$ require intensive Monte-Carlo simulations.
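To make the cost of a direct approach concrete, the EMI (6) itself can only be estimated by averaging over channel draws. The sketch below is a plain Monte-Carlo estimator of $I(\mathbf{Q})$, not the algorithm of [5]; the function name `emi_monte_carlo`, the default number of draws and the seed are illustrative assumptions.

```python
import numpy as np


def psd_sqrt(m):
    """Hermitian square root of a positive semidefinite matrix (via eigh)."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def emi_monte_carlo(c_rx, c_tx, q, sigma2, n_draws=2000, seed=0):
    """Monte-Carlo estimate of I(Q) = E log|I_r + H Q H^H / sigma^2| (in nats).

    H = sum_l H^(l), with each H^(l) drawn according to equation (3).
    """
    rng = np.random.default_rng(seed)
    r, t = c_rx[0].shape[0], c_tx[0].shape[0]
    roots = [(psd_sqrt(c), psd_sqrt(ct)) for c, ct in zip(c_rx, c_tx)]
    total = 0.0
    for _ in range(n_draws):
        h = np.zeros((r, t), dtype=complex)
        for c_half, ct_half in roots:
            w = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
            h += c_half @ w @ ct_half / np.sqrt(t)
        # slogdet returns (sign, log|det|); the determinant is positive here.
        total += np.linalg.slogdet(np.eye(r) + h @ q @ h.conj().T / sigma2)[1]
    return total / n_draws
```

Every evaluation of the objective (and, for a gradient-based method, of its derivatives) requires a fresh average of this kind, which is what makes direct maximization expensive.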

D. The large system approximation of I(Q) 

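When $t$ and $r$ converge to $\infty$ while $t/r \to c$, $c \in (0, \infty)$, [4] showed that $I(\mathbf{Q})$ can be approximated
by $\bar{I}(\mathbf{Q})$ defined by

$$ \bar{I}(\mathbf{Q}) = \log \left| \mathbf{I}_r + \sum_{l=1}^{L} \tilde{\delta}_l(\mathbf{Q}) \mathbf{C}^{(l)} \right| + \log \left| \mathbf{I}_t + \mathbf{Q} \left( \sum_{l=1}^{L} \delta_l(\mathbf{Q}) \tilde{\mathbf{C}}^{(l)} \right) \right| - \sigma^2 t \left( \sum_{l=1}^{L} \delta_l(\mathbf{Q}) \tilde{\delta}_l(\mathbf{Q}) \right), \tag{8} $$

where $(\delta_1(\mathbf{Q}), \ldots, \delta_L(\mathbf{Q}))^T = \boldsymbol{\delta}(\mathbf{Q})$ and $(\tilde{\delta}_1(\mathbf{Q}), \ldots, \tilde{\delta}_L(\mathbf{Q}))^T = \tilde{\boldsymbol{\delta}}(\mathbf{Q})$ are the positive solutions of the
system of $2L$ equations:

$$ \kappa_l = f_l(\tilde{\boldsymbol{\kappa}}), \qquad \tilde{\kappa}_l = \tilde{f}_l(\boldsymbol{\kappa}, \mathbf{Q}) \qquad \text{for } l = 1, \ldots, L, \tag{9} $$

with $\boldsymbol{\kappa} = (\kappa_1, \ldots, \kappa_L)^T$ and $\tilde{\boldsymbol{\kappa}} = (\tilde{\kappa}_1, \ldots, \tilde{\kappa}_L)^T$, and with

$$ f_l(\tilde{\boldsymbol{\kappa}}) = \frac{1}{t} \mathrm{Tr}\left[ \mathbf{C}^{(l)} \mathbf{T}(\tilde{\boldsymbol{\kappa}}) \right], \qquad \tilde{f}_l(\boldsymbol{\kappa}, \mathbf{Q}) = \frac{1}{t} \mathrm{Tr}\left[ \mathbf{Q}^{1/2} \tilde{\mathbf{C}}^{(l)} \mathbf{Q}^{1/2} \tilde{\mathbf{T}}(\boldsymbol{\kappa}, \mathbf{Q}) \right], \tag{10} $$

where the $r \times r$ matrix $\mathbf{T}(\tilde{\boldsymbol{\kappa}})$ and the $t \times t$ matrix $\tilde{\mathbf{T}}(\boldsymbol{\kappa}, \mathbf{Q})$ are defined by

$$ \mathbf{T}(\tilde{\boldsymbol{\kappa}}) = \left[ \sigma^2 \left( \mathbf{I}_r + \sum_{j=1}^{L} \tilde{\kappa}_j \mathbf{C}^{(j)} \right) \right]^{-1}, \qquad \tilde{\mathbf{T}}(\boldsymbol{\kappa}, \mathbf{Q}) = \left[ \sigma^2 \left( \mathbf{I}_t + \sum_{j=1}^{L} \kappa_j \mathbf{Q}^{1/2} \tilde{\mathbf{C}}^{(j)} \mathbf{Q}^{1/2} \right) \right]^{-1}. \tag{11} $$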

III. Deriving the large system approximation 

A. The canonical equations 

In [4], the existence and the uniqueness of positive solutions to (9) is assumed without justification.
Moreover no algorithm is given for the calculation of the $\delta_l$ and $\tilde{\delta}_l$, $l = 1, \ldots, L$. We therefore clarify
below these important points. We consider the case $\mathbf{Q} = \mathbf{I}$ in order to simplify the notations. To address
the general case it is sufficient to change matrices $(\tilde{\mathbf{C}}^{(l)})_{l=1,\ldots,L}$ into $(\mathbf{Q}^{1/2} \tilde{\mathbf{C}}^{(l)} \mathbf{Q}^{1/2})_{l=1,\ldots,L}$ in what
follows.

Theorem 1: The system of equations (9) admits unique positive solutions $(\delta_l)_{l=1,\ldots,L}$ and $(\tilde{\delta}_l)_{l=1,\ldots,L}$,
which are the limits of the following fixed point algorithm:

• Initialization: $\delta_l^{(0)} > 0$, $\tilde{\delta}_l^{(0)} > 0$, $l = 1, \ldots, L$.

• Evaluation of the $\delta_l^{(n+1)}$ and $\tilde{\delta}_l^{(n+1)}$ from $\boldsymbol{\delta}^{(n)} = (\delta_1^{(n)}, \ldots, \delta_L^{(n)})^T$ and $\tilde{\boldsymbol{\delta}}^{(n)} = (\tilde{\delta}_1^{(n)}, \ldots, \tilde{\delta}_L^{(n)})^T$:

$$ \delta_l^{(n+1)} = f_l(\tilde{\boldsymbol{\delta}}^{(n)}), \qquad \tilde{\delta}_l^{(n+1)} = \tilde{f}_l(\boldsymbol{\delta}^{(n)}, \mathbf{I}). \tag{12} $$



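Proof: We prove the existence and uniqueness of positive solutions.

1) Existence: Using an analytic continuation technique, we show in Appendix A that the fixed point
algorithm introduced converges to positive coefficients $\delta_l$ and $\tilde{\delta}_l$, $l = 1, \ldots, L$. As functions $\tilde{\boldsymbol{\kappa}} \mapsto
f_l(\tilde{\boldsymbol{\kappa}})$ and $\boldsymbol{\kappa} \mapsto \tilde{f}_l(\boldsymbol{\kappa}, \mathbf{I})$ are clearly continuous, the limit of $(\boldsymbol{\delta}^{(n)}, \tilde{\boldsymbol{\delta}}^{(n)})$ when $n \to \infty$ satisfies
equation (9). Hence, the convergence of the algorithm yields the existence of a positive solution to
(9).

2) Uniqueness: Let $(\boldsymbol{\delta}, \tilde{\boldsymbol{\delta}})$ and $(\boldsymbol{\delta}', \tilde{\boldsymbol{\delta}}')$ be two solutions of the canonical equation (9) with $\mathbf{Q} = \mathbf{I}$.
We denote $(\mathbf{T}, \tilde{\mathbf{T}})$ and $(\mathbf{T}', \tilde{\mathbf{T}}')$ the associated matrices defined by (11), where $(\boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}})$ respectively
coincide with $(\boldsymbol{\delta}, \tilde{\boldsymbol{\delta}})$ and $(\boldsymbol{\delta}', \tilde{\boldsymbol{\delta}}')$. Introducing $\mathbf{e} = \boldsymbol{\delta} - \boldsymbol{\delta}' = (e_1, \ldots, e_L)^T$ we have:

$$ e_l = \frac{1}{t} \mathrm{Tr}\left[ \mathbf{C}^{(l)} \mathbf{T} \left( \mathbf{T}'^{-1} - \mathbf{T}^{-1} \right) \mathbf{T}' \right] = -\frac{\sigma^2}{t} \sum_{k=1}^{L} \tilde{e}_k \, \mathrm{Tr}\left( \mathbf{C}^{(l)} \mathbf{T} \mathbf{C}^{(k)} \mathbf{T}' \right). \tag{13} $$

Similarly, with $\tilde{\mathbf{e}} = \tilde{\boldsymbol{\delta}} - \tilde{\boldsymbol{\delta}}' = (\tilde{e}_1, \ldots, \tilde{e}_L)^T$,

$$ \tilde{e}_l = -\frac{\sigma^2}{t} \sum_{k=1}^{L} e_k \, \mathrm{Tr}\left( \tilde{\mathbf{C}}^{(l)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}}' \right). \tag{14} $$

And (13) and (14) can be written together as

$$ \begin{bmatrix} \mathbf{I} & \sigma^2 \mathbf{A}(\mathbf{T}, \mathbf{T}') \\ \sigma^2 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}, \tilde{\mathbf{T}}') & \mathbf{I} \end{bmatrix} \begin{bmatrix} \mathbf{e} \\ \tilde{\mathbf{e}} \end{bmatrix} = \mathbf{0}, \tag{15} $$

with $\mathbf{A}_{kl}(\mathbf{T}, \mathbf{T}') = \frac{1}{t}\mathrm{Tr}(\mathbf{C}^{(k)} \mathbf{T} \mathbf{C}^{(l)} \mathbf{T}')$ and $\tilde{\mathbf{A}}_{kl}(\tilde{\mathbf{T}}, \tilde{\mathbf{T}}') = \frac{1}{t}\mathrm{Tr}(\tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(l)} \tilde{\mathbf{T}}')$. We will now prove
that $\rho(\mathbf{M}) < 1$, with $\mathbf{M} = \sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}, \tilde{\mathbf{T}}') \mathbf{A}(\mathbf{T}, \mathbf{T}')$. This will imply that the matrix governing the
linear system (15) is invertible, and thus that $\mathbf{e} = \tilde{\mathbf{e}} = \mathbf{0}$, i.e. the uniqueness.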



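$$ |\mathbf{M}_{kl}| = \frac{\sigma^4}{t^2} \left| \sum_{j=1}^{L} \mathrm{Tr}(\tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(j)} \tilde{\mathbf{T}}') \, \mathrm{Tr}(\mathbf{C}^{(j)} \mathbf{T} \mathbf{C}^{(l)} \mathbf{T}') \right| \le \frac{\sigma^4}{t^2} \sum_{j=1}^{L} \left| \mathrm{Tr}(\tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(j)} \tilde{\mathbf{T}}') \right| \left| \mathrm{Tr}(\mathbf{C}^{(j)} \mathbf{T} \mathbf{C}^{(l)} \mathbf{T}') \right|. \tag{16} $$

Thanks to the inequality $|\mathrm{Tr}(\mathbf{A}\mathbf{B})| \le \sqrt{\mathrm{Tr}(\mathbf{A}\mathbf{A}^H) \mathrm{Tr}(\mathbf{B}\mathbf{B}^H)}$, we have

$$ \frac{1}{t} \left| \mathrm{Tr}(\tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(j)} \tilde{\mathbf{T}}') \right| \le \sqrt{ \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}, \tilde{\mathbf{T}}) \, \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}', \tilde{\mathbf{T}}') }, \qquad \frac{1}{t} \left| \mathrm{Tr}(\mathbf{C}^{(j)} \mathbf{T} \mathbf{C}^{(l)} \mathbf{T}') \right| \le \sqrt{ \mathbf{A}_{jl}(\mathbf{T}, \mathbf{T}) \, \mathbf{A}_{jl}(\mathbf{T}', \mathbf{T}') }. \tag{17} $$

Using (17) in (16) gives

$$ |\mathbf{M}_{kl}| \le \sigma^4 \sum_{j=1}^{L} \sqrt{ \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}) \, \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}') \, \mathbf{A}_{jl}(\mathbf{T}) \, \mathbf{A}_{jl}(\mathbf{T}') }, $$

where matrices $\mathbf{A}(\mathbf{T})$ and $\tilde{\mathbf{A}}(\tilde{\mathbf{T}})$ are defined by

$$ \mathbf{A}_{kl}(\mathbf{T}) = \frac{1}{t}\mathrm{Tr}(\mathbf{C}^{(k)} \mathbf{T} \mathbf{C}^{(l)} \mathbf{T}) = \mathbf{A}_{kl}(\mathbf{T}, \mathbf{T}), \qquad \tilde{\mathbf{A}}_{kl}(\tilde{\mathbf{T}}) = \frac{1}{t}\mathrm{Tr}(\tilde{\mathbf{C}}^{(k)} \tilde{\mathbf{T}} \tilde{\mathbf{C}}^{(l)} \tilde{\mathbf{T}}) = \tilde{\mathbf{A}}_{kl}(\tilde{\mathbf{T}}, \tilde{\mathbf{T}}). $$

Using the Cauchy-Schwarz inequality,

$$ |\mathbf{M}_{kl}| \le \sigma^4 \sqrt{ \sum_{j=1}^{L} \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}) \mathbf{A}_{jl}(\mathbf{T}) } \, \sqrt{ \sum_{j=1}^{L} \tilde{\mathbf{A}}_{kj}(\tilde{\mathbf{T}}') \mathbf{A}_{jl}(\mathbf{T}') }. \tag{18} $$

Hence, defining the matrix $\mathbf{P}$ by $\mathbf{P}_{kl} = \sqrt{ \left( \sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}) \mathbf{A}(\mathbf{T}) \right)_{kl} } \sqrt{ \left( \sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}') \mathbf{A}(\mathbf{T}') \right)_{kl} }$, we have $|\mathbf{M}_{kl}| \le
\mathbf{P}_{kl}$ $\forall k, l$. Theorem 8.1.18 of [11] then yields $\rho(\mathbf{M}) \le \rho(\mathbf{P})$. Besides, Lemma 5.7.9 of [12] used
on the definition of $\mathbf{P}$ gives:

$$ \rho(\mathbf{P}) \le \sqrt{ \rho\left( \sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}) \mathbf{A}(\mathbf{T}) \right) } \sqrt{ \rho\left( \sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}') \mathbf{A}(\mathbf{T}') \right) }. \tag{19} $$

The corresponding lemma in the Appendix implies that $\rho(\sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}) \mathbf{A}(\mathbf{T})) < 1$ and $\rho(\sigma^4 \tilde{\mathbf{A}}(\tilde{\mathbf{T}}') \mathbf{A}(\mathbf{T}')) < 1$, so
that (19) finally implies:

$$ \rho(\mathbf{M}) \le \rho(\mathbf{P}) < 1. $$

This completes the proof of Theorem 1. ■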
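The fixed point algorithm of Theorem 1 and the evaluation of the approximation (8) can be sketched numerically as follows. This is a minimal illustration and not the authors' implementation: the function names `solve_canonical` and `emi_approx`, the all-ones initialization, the stopping tolerance and the iteration cap are assumptions made here. The general case is handled as described in the text, by replacing each transmit correlation matrix by $\mathbf{Q}^{1/2} \tilde{\mathbf{C}}^{(l)} \mathbf{Q}^{1/2}$.

```python
import numpy as np


def psd_sqrt(m):
    """Hermitian square root of a positive semidefinite matrix (via eigh)."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def solve_canonical(c_rx, c_tx, q, sigma2, tol=1e-12, max_iter=10000):
    """Fixed-point iteration (12) for the canonical system (9) at a given Q.

    Returns (delta, delta_tilde), two arrays of length L.
    """
    r, t = c_rx[0].shape[0], q.shape[0]
    q_half = psd_sqrt(q)
    c_eff = [q_half @ c @ q_half for c in c_tx]  # Q^{1/2} C~^(l) Q^{1/2}
    delta, delta_t = np.ones(len(c_rx)), np.ones(len(c_rx))
    for _ in range(max_iter):
        # Matrices T and T~ of equation (11) at the current iterate.
        t_mat = np.linalg.inv(sigma2 * (np.eye(r) + sum(d * c for d, c in zip(delta_t, c_rx))))
        t_til = np.linalg.inv(sigma2 * (np.eye(t) + sum(d * c for d, c in zip(delta, c_eff))))
        new_d = np.array([np.trace(c @ t_mat).real / t for c in c_rx])
        new_dt = np.array([np.trace(c @ t_til).real / t for c in c_eff])
        step = max(np.abs(new_d - delta).max(), np.abs(new_dt - delta_t).max())
        delta, delta_t = new_d, new_dt
        if step < tol:
            break
    return delta, delta_t


def emi_approx(c_rx, c_tx, q, sigma2):
    """Large system approximation (8) of the EMI, in nats."""
    delta, delta_t = solve_canonical(c_rx, c_tx, q, sigma2)
    r, t = c_rx[0].shape[0], q.shape[0]
    c_of = sum(d * c for d, c in zip(delta_t, c_rx))   # sum_l delta~_l C^(l)
    ct_of = sum(d * c for d, c in zip(delta, c_tx))    # sum_l delta_l C~^(l)
    term_rx = np.linalg.slogdet(np.eye(r) + c_of)[1]
    term_tx = np.linalg.slogdet(np.eye(t) + q @ ct_of)[1]
    return term_rx + term_tx - sigma2 * t * float(delta @ delta_t)
```

For a single uncorrelated path with $r = t$, $\mathbf{Q} = \mathbf{I}$ and $\sigma^2 = 1$, the system reduces to $\delta = 1/(1+\delta)$, so the iteration should return $\delta = \tilde{\delta} = (\sqrt{5}-1)/2$, which gives a simple way to check the code.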
B. Deriving the approximation of $I(\mathbf{Q} = \mathbf{I}_t)$ with Gaussian methods

We consider in this section the case $\mathbf{Q} = \mathbf{I}_t$. We note $I = I(\mathbf{I}_t)$, $\bar{I} = \bar{I}(\mathbf{I}_t)$. We have proved in the
previous section the consistency of the definition of $\bar{I}(\mathbf{Q})$. To establish the approximation of $I(\mathbf{Q})$, [4] used the
replica method, a useful and simple trick whose mathematical relevance is not yet proved in the present
context. Moreover, no assumptions were specified for the convergence of $I(\mathbf{Q})$ towards $\bar{I}(\mathbf{Q})$. However,
using large random matrix techniques similar to those of [6], [8], it is possible to prove rigorously the
following theorem, in which the (mild) suitable technical assumptions are clarified.

Theorem 2: Assume that, for every $j \in \{1, \ldots, L\}$, $\sup_t \|\mathbf{C}^{(j)}\| < +\infty$, $\sup_t \|\tilde{\mathbf{C}}^{(j)}\| < +\infty$,
$\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \mathbf{C}^{(j)} \right) > 0$ and $\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \tilde{\mathbf{C}}^{(j)} \right) > 0$. Then,

$$ I = \bar{I} + \mathcal{O}\left( \frac{1}{t} \right). $$

Sketch of proof: The proof is done in three steps:

1) In a first step we derive a large system approximation of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$, where $\mathbf{S} = (\mathbf{H}\mathbf{H}^H + \sigma^2 \mathbf{I}_r)^{-1}$
is the resolvent of $\mathbf{H}\mathbf{H}^H$ at point $-\sigma^2$. Nonetheless the approximation is expressed with the terms
$\alpha_l = \frac{1}{t}\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}(\mathbf{C}^{(l)} \mathbf{S})]$, $l = 1, \ldots, L$, which still depend on the entries of $\mathbb{E}_{\mathbf{H}}[\mathbf{S}]$.

2) A second step refines the previous approximation to obtain an approximation which this time only
depends on the variance structure of the channels, i.e. matrices $(\mathbf{C}^{(l)})_{l \in \{1,\ldots,L\}}$ and $(\tilde{\mathbf{C}}^{(l)})_{l \in \{1,\ldots,L\}}$.

3) The previous approximation is used to get the asymptotic behavior of mutual information by a proper
integration.



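Proof: We now sketch the three steps stated above. We provide the missing details in the Appendix.
1) A first large system approximation of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$: We introduce vectors $\boldsymbol{\alpha} = (\alpha_1, \ldots, \alpha_L)^T$ and
$\tilde{\boldsymbol{\alpha}} = (\tilde{\alpha}_1, \ldots, \tilde{\alpha}_L)^T$ defined by

$$ \alpha_l = \frac{1}{t}\mathrm{Tr}\left[ \mathbf{C}^{(l)} \mathbb{E}_{\mathbf{H}}[\mathbf{S}] \right], \qquad \tilde{\alpha}_l = \frac{1}{t}\mathrm{Tr}\left[ \tilde{\mathbf{C}}^{(l)} \tilde{\mathbf{R}} \right] \qquad \text{for } l = 1, \ldots, L, \tag{20} $$

where matrix $\tilde{\mathbf{R}}$ is defined by $\tilde{\mathbf{R}}(\boldsymbol{\alpha}) = \left[ \sigma^2 \left( \mathbf{I}_t + \sum_{j=1}^{L} \alpha_j \tilde{\mathbf{C}}^{(j)} \right) \right]^{-1}$. Using large random matrix
techniques similar to those of [6], [8], the following proposition is proved in Appendix B.

Proposition 1: Assume that, for every $j \in \{1, \ldots, L\}$, $\sup_t \|\mathbf{C}^{(j)}\| < +\infty$, $\sup_t \|\tilde{\mathbf{C}}^{(j)}\| < +\infty$. Then
$\mathbb{E}_{\mathbf{H}}[\mathbf{S}]$ can be written as

$$ \mathbb{E}_{\mathbf{H}}[\mathbf{S}] = \mathbf{R} + \boldsymbol{\Upsilon}, \tag{21} $$

where matrix $\boldsymbol{\Upsilon}$ is such that $\frac{1}{t}\mathrm{Tr}(\boldsymbol{\Upsilon}\mathbf{A}) = \mathcal{O}\left(\frac{1}{t^2}\right)$ for any uniformly bounded matrix $\mathbf{A}$ and matrix $\mathbf{R}$ is
defined by $\mathbf{R}(\tilde{\boldsymbol{\alpha}}) = \left[ \sigma^2 \left( \mathbf{I}_r + \sum_{j=1}^{L} \tilde{\alpha}_j \mathbf{C}^{(j)} \right) \right]^{-1}$.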



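One can check that the entries of matrix $\boldsymbol{\Upsilon}$ are $\mathcal{O}\left(\frac{1}{t^{3/2}}\right)$; nevertheless this result is not needed here. It
follows from Proposition 1 that, for any $r \times r$ matrix $\mathbf{A}$ uniformly bounded in $r$,

$$ \frac{1}{t}\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}(\mathbf{S}\mathbf{A})] = \frac{1}{t}\mathrm{Tr}(\mathbf{R}\mathbf{A}) + \mathcal{O}\left( \frac{1}{t^2} \right). \tag{22} $$

Taking $\mathbf{A} = \mathbf{I}$ gives a first approximation of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$:

$$ \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}] = \mathrm{Tr}\, \mathbf{R} + \mathcal{O}\left( \frac{1}{t} \right). \tag{23} $$

Nonetheless matrix $\mathbf{R}$ depends on $\mathbb{E}_{\mathbf{H}}[\mathbf{S}]$ through vector $\boldsymbol{\alpha}$.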

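2) A refined large system approximation of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$: We first recall from paragraph III-A that $\mathbf{T}$ is
the matrix defined by (11) associated to the solutions $(\boldsymbol{\delta}, \tilde{\boldsymbol{\delta}})$ of the canonical equation (9) with $\mathbf{Q} = \mathbf{I}$:
$\mathbf{T} = \left[ \sigma^2 \left( \mathbf{I}_r + \sum_{j=1}^{L} \tilde{\delta}_j \mathbf{C}^{(j)} \right) \right]^{-1}$. We introduce the following proposition which will lead to the desired
approximation of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$:

Proposition 2: Assume that, for every $j \in \{1, \ldots, L\}$, $\sup_t \|\mathbf{C}^{(j)}\| < +\infty$, $\sup_t \|\tilde{\mathbf{C}}^{(j)}\| < +\infty$,
$\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \mathbf{C}^{(j)} \right) > 0$ and $\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \tilde{\mathbf{C}}^{(j)} \right) > 0$. Let $\mathbf{A}$ be an $r \times r$ matrix uniformly bounded in $r$, then

$$ \frac{1}{t}\mathrm{Tr}(\mathbf{R}\mathbf{A}) = \frac{1}{t}\mathrm{Tr}(\mathbf{T}\mathbf{A}) + \mathcal{O}\left( \frac{1}{t^2} \right). \tag{24} $$

The proof is given in Appendix C. It relies on the similarity of the systems of equations verified by
the $(\alpha_l, \tilde{\alpha}_l)$ and the $(\delta_l, \tilde{\delta}_l)$. Actually, taking $\mathbf{A} = \mathbf{C}^{(l)}$ in (22) yields $\alpha_l = \frac{1}{t}\mathrm{Tr}(\mathbf{C}^{(l)}\mathbf{R}) + \mathcal{O}\left(\frac{1}{t^2}\right)$ and
therefore

$$ \alpha_l = \frac{1}{t}\mathrm{Tr}\left[ \mathbf{C}^{(l)} \left[ \sigma^2 \left( \mathbf{I}_r + \sum_{j=1}^{L} \tilde{\alpha}_j \mathbf{C}^{(j)} \right) \right]^{-1} \right] + \mathcal{O}\left( \frac{1}{t^2} \right), \qquad \tilde{\alpha}_l = \frac{1}{t}\mathrm{Tr}\left[ \tilde{\mathbf{C}}^{(l)} \left[ \sigma^2 \left( \mathbf{I}_t + \sum_{j=1}^{L} \alpha_j \tilde{\mathbf{C}}^{(j)} \right) \right]^{-1} \right] \qquad \text{for } l = 1, \ldots, L. \tag{25} $$

Taking $\mathbf{A} = \mathbf{I}_r$ in (24) together with (23) leads to

$$ \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}] = \mathrm{Tr}\, \mathbf{T} + \mathcal{O}\left( \frac{1}{t} \right). \tag{26} $$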

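3) The resulting large system approximation of $I$: The ergodic mutual information $I$ can be written
in terms of the resolvent $\mathbf{S}$:

$$ I = \mathbb{E}_{\mathbf{H}} \left[ \log \left| \mathbf{I}_r + \frac{\mathbf{H}\mathbf{H}^H}{\sigma^2} \right| \right] = -\mathbb{E}_{\mathbf{H}} \left[ \log \left| \sigma^2 \mathbf{S}(\sigma^2) \right| \right]. $$

As the differential of $g(\mathbf{A}) = \log |\mathbf{A}|$ is given by $g(\mathbf{A} + \delta\mathbf{A}) = g(\mathbf{A}) + \mathrm{Tr}[\mathbf{A}^{-1} \delta\mathbf{A}] + o(\|\delta\mathbf{A}\|)$, we have
the following equality:

$$ \frac{dI}{d\sigma^2} = -\frac{\mathbb{E}_{\mathbf{H}} \left[ \mathrm{Tr}[\mathbf{S}(\sigma^2) \mathbf{H}\mathbf{H}^H] \right]}{\sigma^2} = -\frac{\mathbb{E}_{\mathbf{H}} \left[ \mathrm{Tr}[\mathbf{I}_r - \sigma^2 \mathbf{S}(\sigma^2)] \right]}{\sigma^2}, $$

where the last equality follows from the so-called resolvent identity

$$ \sigma^2 \mathbf{S}(\sigma^2) = \mathbf{I}_r - \mathbf{S}(\sigma^2) \mathbf{H}\mathbf{H}^H. \tag{27} $$

The resolvent identity is inferred easily from the definition of $\mathbf{S}(\sigma^2)$. As $I(\sigma^2 = +\infty) = 0$, we now
have the following expression of mutual information:

$$ I(\sigma^2) = \int_{\sigma^2}^{+\infty} \left( \frac{r}{\rho} - \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}(\rho)] \right) d\rho. $$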

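This equality clearly justifies the search of a large system equivalent of $\mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}]$ done in the previous
sections. Using (26), the term under the integral sign becomes:

$$ \frac{r}{\sigma^2} - \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}\, \mathbf{S}] = t \sum_{l=1}^{L} \delta_l \tilde{\delta}_l + \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}(\mathbf{T} - \mathbf{S})], $$

as $\frac{r}{\sigma^2} - \mathrm{Tr}\, \mathbf{T} = \mathrm{Tr}\left[ \left( (\sigma^2 \mathbf{T})^{-1} - \mathbf{I}_r \right) \mathbf{T} \right] = \mathrm{Tr}\left[ \left( \sum_{l=1}^{L} \tilde{\delta}_l \mathbf{C}^{(l)} \right) \mathbf{T} \right] = t \sum_{l=1}^{L} \delta_l \tilde{\delta}_l$. We need to integrate
$\varepsilon(t, \sigma^2) = \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}(\mathbf{T} - \mathbf{S})]$ on $(\rho, +\infty)$, $\rho > 0$, with respect to $\sigma^2$. We therefore introduce the following
proposition:

Proposition 3: $\varepsilon(t, \sigma^2) = \mathbb{E}_{\mathbf{H}}[\mathrm{Tr}(\mathbf{T} - \mathbf{S})]$ is integrable with respect to $\sigma^2$ on $(\rho, +\infty)$, $\rho > 0$, and

$$ \int_{\rho}^{+\infty} \varepsilon(t, \sigma^2) \, d\sigma^2 = \mathcal{O}\left( \frac{1}{t} \right). $$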

Proof: We prove in Appendix iDl that there exists to such that, for t > to, \e(t, a 2 )\ < -^.P (-"t)i 
where P is a polynomial whose coefficients are real positive and do not depend on a 2 nor on t. Therefore 

$\int_{\rho}^{+\infty} \varepsilon(t, \sigma^2) \, d\sigma^2 = \mathcal{O}\left( \frac{1}{t} \right)$.

■

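We now prove that the term $t \sum_{l} \delta_l \tilde{\delta}_l$ corresponds, up to its sign, to the derivative of $\bar{I}(\sigma^2)$ with respect to $\sigma^2$. To this
end, we consider the function $V_0(\sigma^2, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}})$ defined by

$$ V_0(\sigma^2, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) = \log \left| \mathbf{I} + \mathbf{C}(\tilde{\boldsymbol{\kappa}}) \right| + \log \left| \mathbf{I} + \tilde{\mathbf{C}}(\boldsymbol{\kappa}) \right| - \sigma^2 t \sum_{l=1}^{L} \kappa_l \tilde{\kappa}_l, \tag{28} $$

where $\mathbf{C}(\tilde{\boldsymbol{\kappa}}) = \sum_{l=1}^{L} \tilde{\kappa}_l \mathbf{C}^{(l)}$ and $\tilde{\mathbf{C}}(\boldsymbol{\kappa}) = \sum_{l=1}^{L} \kappa_l \tilde{\mathbf{C}}^{(l)}$. Note that $V_0(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}}) = \bar{I}(\sigma^2)$. The derivative
of $\bar{I}(\sigma^2)$ can then be expressed in terms of the partial derivatives of $V_0$:

$$ \frac{d\bar{I}}{d\sigma^2} = \frac{\partial V_0}{\partial \sigma^2}(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}}) + \sum_{l=1}^{L} \frac{\partial V_0}{\partial \kappa_l}(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}}) \frac{d\delta_l}{d\sigma^2} + \sum_{l=1}^{L} \frac{\partial V_0}{\partial \tilde{\kappa}_l}(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}}) \frac{d\tilde{\delta}_l}{d\sigma^2}. \tag{29} $$

It is straightforward to check that

$$ \frac{\partial V_0}{\partial \kappa_l}(\sigma^2, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) = -\sigma^2 t \left( \tilde{\kappa}_l - \tilde{f}_l(\boldsymbol{\kappa}, \mathbf{I}_t) \right), \qquad \frac{\partial V_0}{\partial \tilde{\kappa}_l}(\sigma^2, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) = -\sigma^2 t \left( \kappa_l - f_l(\tilde{\boldsymbol{\kappa}}) \right). \tag{30} $$

Both partial derivatives are equal to zero at point $(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}})$, as $(\boldsymbol{\delta}, \tilde{\boldsymbol{\delta}})$ verifies by definition (9) with
$\mathbf{Q} = \mathbf{I}_t$. Therefore,

$$ \frac{d\bar{I}}{d\sigma^2} = \frac{\partial V_0}{\partial \sigma^2}(\sigma^2, \boldsymbol{\delta}, \tilde{\boldsymbol{\delta}}) = -t \sum_{l=1}^{L} \delta_l \tilde{\delta}_l, $$

which, together with Proposition 3, leads to $I = \bar{I} + \mathcal{O}\left( \frac{1}{t} \right)$.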



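C. The approximation $\bar{I}(\mathbf{Q})$

We now consider the dependency in $\mathbf{Q}$ of the approximation $\bar{I}(\mathbf{Q})$. We previously considered
the case $\mathbf{Q} = \mathbf{I}$; to address the general case it is sufficient to change matrices $(\tilde{\mathbf{C}}^{(l)})_{l=1,\ldots,L}$ into
$(\mathbf{Q}^{1/2} \tilde{\mathbf{C}}^{(l)} \mathbf{Q}^{1/2})_{l=1,\ldots,L}$ in III-A and III-B. Hence the following Corollary of Theorem 2:

Corollary 1: Assume that, for every $j \in \{1, \ldots, L\}$, $\sup_t \|\mathbf{C}^{(j)}\| < +\infty$, $\sup_t \|\tilde{\mathbf{C}}^{(j)}\| < +\infty$,
$\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \mathbf{C}^{(j)} \right) > 0$ and $\inf_t \lambda_{\min}(\tilde{\mathbf{C}}^{(j)}) > 0$. Then, for $\mathbf{Q}$ such that $\sup_t \|\mathbf{Q}\| < +\infty$,

$$ I(\mathbf{Q}) = \bar{I}(\mathbf{Q}) + \mathcal{O}\left( \frac{1}{t} \right). $$

Note that the technical assumptions on matrices $(\tilde{\mathbf{C}}^{(l)})_{l=1,\ldots,L}$ are slightly stronger than in Theorem 2 in
order to ensure that $\inf_t \left( \frac{1}{t}\mathrm{Tr}\, \mathbf{Q}\tilde{\mathbf{C}}^{(l)} \right) > 0$.

We can now state an important result about the concavity of the function $\mathbf{Q} \mapsto \bar{I}(\mathbf{Q})$, a result which
will be needed for its optimization in Section IV.

Theorem 3: $\mathbf{Q} \mapsto \bar{I}(\mathbf{Q})$ is a strictly concave function over the compact set $\mathcal{C}_1$.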

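Proof: We here only prove the concavity of $\bar{I}(\mathbf{Q})$. The proof of the strict concavity is quite tedious,
but essentially the same as in [8] section IV (see also the extended version [9]). It is therefore omitted.
Denote by $\otimes$ the Kronecker product of matrices. Let us introduce the following matrices:

$$ \boldsymbol{\Delta}^{(l)} = \mathbf{I}_m \otimes \mathbf{C}^{(l)}, \qquad \tilde{\boldsymbol{\Delta}}^{(l)} = \mathbf{I}_m \otimes \tilde{\mathbf{C}}^{(l)}, \qquad \check{\mathbf{Q}} = \mathbf{I}_m \otimes \mathbf{Q}. $$

We now denote

$$ \check{\mathbf{H}}(z) = \sum_{l=1}^{L} \check{\mathbf{H}}^{(l)} z^{-(l-1)} \qquad \text{with} \qquad \check{\mathbf{H}}^{(l)} = \frac{1}{\sqrt{mt}} (\boldsymbol{\Delta}^{(l)})^{1/2} \check{\mathbf{W}}_l (\tilde{\boldsymbol{\Delta}}^{(l)})^{1/2}, $$

where $\check{\mathbf{W}}_l$ is an $rm \times tm$ matrix whose entries are independent and identically distributed complex circular
Gaussian random variables with variance 1. Introducing $I_m(\check{\mathbf{Q}})$ the ergodic mutual information associated
with channel $\check{\mathbf{H}}(z)$:

$$ I_m(\check{\mathbf{Q}}) = \mathbb{E}_{\check{\mathbf{H}}} \log \left| \mathbf{I} + \frac{1}{\sigma^2} \check{\mathbf{H}} \check{\mathbf{Q}} \check{\mathbf{H}}^H \right|, $$

where $\check{\mathbf{H}} = \check{\mathbf{H}}(1) = \sum_{l=1}^{L} \check{\mathbf{H}}^{(l)}$. Using the results of [4] and Theorem 2 it is clear that $I_m(\check{\mathbf{Q}})$ admits an
asymptotic approximation $\bar{I}_m(\check{\mathbf{Q}})$. Due to the block-diagonal nature of matrices $\boldsymbol{\Delta}^{(l)}$, $\tilde{\boldsymbol{\Delta}}^{(l)}$ and $\check{\mathbf{Q}}$, it is
straightforward to show that $\delta_l(\check{\mathbf{Q}}) = \delta_l(\mathbf{Q})$, $\tilde{\delta}_l(\check{\mathbf{Q}}) = \tilde{\delta}_l(\mathbf{Q})$ and that, as a consequence,

$$ \frac{1}{m} \bar{I}_m(\check{\mathbf{Q}}) = \bar{I}(\mathbf{Q}), $$

and thus

$$ \lim_{m \to \infty} \frac{1}{m} I_m(\check{\mathbf{Q}}) = \bar{I}(\mathbf{Q}). $$

As $\mathbf{Q} \mapsto I_m(\check{\mathbf{Q}})$ is concave, we can conclude that $\bar{I}(\mathbf{Q})$ is concave as a pointwise limit of concave
functions. ■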

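As $\bar{I}(\mathbf{Q})$ is strictly concave on $\mathcal{C}_1$ by Theorem 3, it admits a unique argmax that we denote $\overline{\mathbf{Q}}_*$. We
recall that $I(\mathbf{Q})$ is strictly concave on $\mathcal{C}_1$ and that we denoted $\mathbf{Q}_*$ its argmax. In order to clarify the relation
between $\mathbf{Q}_*$ and $\overline{\mathbf{Q}}_*$ we introduce the following proposition which establishes that the maximization of
$I(\mathbf{Q})$ is equivalent to the maximization of $\bar{I}(\mathbf{Q})$ over $\mathcal{C}_1$, up to a $\mathcal{O}\left(\frac{1}{t}\right)$ term.

Proposition 4: Assume that, for every $j \in \{1, \ldots, L\}$, $\sup_t \|\mathbf{C}^{(j)}\| < +\infty$, $\sup_t \|\tilde{\mathbf{C}}^{(j)}\| < +\infty$,
$\inf_t \lambda_{\min}(\mathbf{C}^{(j)}) > 0$ and $\inf_t \lambda_{\min}(\tilde{\mathbf{C}}^{(j)}) > 0$. Then

$$ I(\overline{\mathbf{Q}}_*) = I(\mathbf{Q}_*) + \mathcal{O}\left( \frac{1}{t} \right). $$

Proof: The proof is very similar to the one of [8, Proposition 3]. Assuming that $\sup_t \|\overline{\mathbf{Q}}_*\| < +\infty$
and $\sup_t \|\mathbf{Q}_*\| < +\infty$ we can apply Theorem 2 (in the form of Corollary 1) on $\overline{\mathbf{Q}}_*$ and $\mathbf{Q}_*$, hence

$$ \left[ I(\mathbf{Q}_*) - \bar{I}(\mathbf{Q}_*) \right] + \left[ \bar{I}(\overline{\mathbf{Q}}_*) - I(\overline{\mathbf{Q}}_*) \right] = \left[ I(\mathbf{Q}_*) - I(\overline{\mathbf{Q}}_*) \right] + \left[ \bar{I}(\overline{\mathbf{Q}}_*) - \bar{I}(\mathbf{Q}_*) \right] = \mathcal{O}\left( \frac{1}{t} \right). $$

Besides $I(\mathbf{Q}_*) - I(\overline{\mathbf{Q}}_*) \ge 0$ and $\bar{I}(\overline{\mathbf{Q}}_*) - \bar{I}(\mathbf{Q}_*) \ge 0$, as $\mathbf{Q}_*$ and $\overline{\mathbf{Q}}_*$ respectively maximize $I(\mathbf{Q})$ and
$\bar{I}(\mathbf{Q})$. Therefore $I(\mathbf{Q}_*) - I(\overline{\mathbf{Q}}_*) = \mathcal{O}\left(\frac{1}{t}\right)$.

One can prove $\sup_t \|\overline{\mathbf{Q}}_*\| < +\infty$ using the same arguments as in [8, Appendix III]. It essentially lies
in the fact that $\overline{\mathbf{Q}}_*$ is the solution of a waterfilling algorithm, which will be shown independently from
this result in the next section (see Proposition 7).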

Concerning sup t ||Q*|| < +oo, the proof is identical to (8j Appendix III], one just needs to replace 
T^A by ^ZtA^) 1/2 ^(C^ 1/2 and ^-^cf WCf by -L(CW) 1/2 Wl (C«) 1/2 
in the definition of H. Then Sj, defined in [8, (134)], becomes 



^(C«)^R, (E (C«) 1/2 ^ + (C«) 1/2 u,) } + iuj 



5, = 2Re I >/- (CW)^ I 2J (CW) V ^J + (C^T'S I ^^(CW) 1 ^-^) 1 ^, 



(31) 
where Rj has the same definition as in (H, zy is the j th column of matrix Wj(C^) and Zj = 
•z\j = vlj + Uj with Uj the conditional expectation Uj = E zij (zi,fc) 1<fc<4 fc / • . As the vector uj~ is 
independent from Rj and from z; &, fc = 1, . . . , t, I = 2, . . . , L we can easily prove that the first term 
of the right-hand side of (f3TT > is a (j). The second term of the right-hand side of (OTT ) is moreover 
close from pj = ~ [(C^) -1 ] . . Tr (RjCW). In fact it is possible to prove that there exists a constant 
C\ such that E [(Sj - pj) 2 ] < ^f (see [8"| for more details). 
The rest of the proof of [8 , Proposition 3 (ii)] can then follow. 



IV. Maximization algorithm 

Proposition 4 shows that it is relevant to maximize $\bar{I}(\mathbf{Q})$ over $\mathcal{C}_1$. In this section we propose a
maximization algorithm for the large system approximation $\bar{I}(\mathbf{Q})$. We first introduce some classical
concepts and results needed for the optimization of $\mathbf{Q} \mapsto \bar{I}(\mathbf{Q})$.

Definition 1: Let $\phi$ be a function defined on the convex set $\mathcal{C}_1$. Let $\mathbf{P}, \mathbf{Q} \in \mathcal{C}_1$. Then $\phi$ is said to be
differentiable in the Gâteaux sense (or Gâteaux differentiable) at point $\mathbf{Q}$ in the direction $\mathbf{P} - \mathbf{Q}$ if the
following limit exists:

$$ \lim_{\lambda \to 0^+} \frac{\phi(\mathbf{Q} + \lambda(\mathbf{P} - \mathbf{Q})) - \phi(\mathbf{Q})}{\lambda}. $$

In this case, this limit is noted $\langle \phi'(\mathbf{Q}), \mathbf{P} - \mathbf{Q} \rangle$.
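As a concrete instance of Definition 1, the sketch below compares the difference quotient with a closed form for the function $\phi(\mathbf{Q}) = \log|\mathbf{I} + \mathbf{Q}\mathbf{C}|$, whose Gâteaux differential in the direction $\mathbf{P} - \mathbf{Q}$ is $\mathrm{Tr}[(\mathbf{I} + \mathbf{Q}\mathbf{C})^{-1} (\mathbf{P} - \mathbf{Q}) \mathbf{C}]$. The function names and the diagonal example matrices are assumptions chosen here for illustration.

```python
import numpy as np


def logdet_objective(q, c):
    """phi(Q) = log|I + Q C| for a fixed positive semidefinite matrix C."""
    return np.linalg.slogdet(np.eye(q.shape[0]) + q @ c)[1]


def gateaux_quotient(phi, q, p, lam):
    """Difference quotient (phi(Q + lam (P - Q)) - phi(Q)) / lam of Definition 1."""
    return (phi(q + lam * (p - q)) - phi(q)) / lam


def logdet_gateaux(q, c, p):
    """Closed form Tr[(I + Q C)^{-1} (P - Q) C] of the Gateaux differential."""
    n = q.shape[0]
    return np.trace(np.linalg.solve(np.eye(n) + q @ c, (p - q) @ c)).real


# Hypothetical example with t = 3; both Q and P satisfy Tr(.)/t = 1.
c = np.diag([2.0, 1.0, 0.5])
q = np.eye(3)
p = np.diag([3.0, 0.0, 0.0])
numeric = gateaux_quotient(lambda m: logdet_objective(m, c), q, p, 1e-6)
exact = logdet_gateaux(q, c, p)
```

In this example the closed form evaluates to $4/3 - 1/2 - 1/3 = 1/2$, and the difference quotient approaches the same value as $\lambda \to 0^+$.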

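Note that $\phi(\mathbf{Q} + \lambda(\mathbf{P} - \mathbf{Q}))$ makes sense for $\lambda \in [0, 1]$, as $\mathbf{Q} + \lambda(\mathbf{P} - \mathbf{Q}) = (1 - \lambda)\mathbf{Q} + \lambda\mathbf{P}$ naturally
belongs to $\mathcal{C}_1$. We now establish the following result:

Proposition 5: For each $\mathbf{P}, \mathbf{Q} \in \mathcal{C}_1$, functions $\mathbf{Q} \mapsto \delta_l(\mathbf{Q})$, $\mathbf{Q} \mapsto \tilde{\delta}_l(\mathbf{Q})$, $l = 1, \ldots, L$, as well
as function $\mathbf{Q} \mapsto \bar{I}(\mathbf{Q})$ are Gâteaux differentiable at $\mathbf{Q}$ in the direction $\mathbf{P} - \mathbf{Q}$.

Proof: See Appendix E. ■

In order to characterize matrix $\overline{\mathbf{Q}}_*$ we recall the following result:
Proposition 6: Let $\phi : \mathcal{C}_1 \to \mathbb{R}$ be a strictly concave function. Then,
(i) $\phi$ is Gâteaux differentiable at $\mathbf{Q}$ in the direction $\mathbf{P} - \mathbf{Q}$ for each $\mathbf{P}, \mathbf{Q} \in \mathcal{C}_1$,
(ii) $\mathbf{Q}_{opt}$ is the unique argmax of $\phi$ on $\mathcal{C}_1$ if and only if it verifies:

$$ \forall \mathbf{Q} \in \mathcal{C}_1, \quad \langle \phi'(\mathbf{Q}_{opt}), \mathbf{Q} - \mathbf{Q}_{opt} \rangle \le 0. \tag{32} $$

This proposition is standard (see for example Chapter 2 of [13]).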
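In order to introduce our maximization algorithm, we consider the function $V(\mathbf{Q}, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}})$ defined by:

$$ V(\mathbf{Q}, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) = \log \left| \mathbf{I} + \mathbf{C}(\tilde{\boldsymbol{\kappa}}) \right| + \log \left| \mathbf{I} + \mathbf{Q}\tilde{\mathbf{C}}(\boldsymbol{\kappa}) \right| - \sigma^2 t \sum_{l=1}^{L} \kappa_l \tilde{\kappa}_l. \tag{33} $$

We recall that $\mathbf{C}(\tilde{\boldsymbol{\kappa}}) = \sum_{l=1}^{L} \tilde{\kappa}_l \mathbf{C}^{(l)}$ and $\tilde{\mathbf{C}}(\boldsymbol{\kappa}) = \sum_{l=1}^{L} \kappa_l \tilde{\mathbf{C}}^{(l)}$. Note that we have $V(\mathbf{Q}, \boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q})) =
\bar{I}(\mathbf{Q})$. We have then the following result:

Proposition 7: Denote by $\boldsymbol{\delta}_*$ and $\tilde{\boldsymbol{\delta}}_*$ the quantities $\boldsymbol{\delta}(\overline{\mathbf{Q}}_*)$ and $\tilde{\boldsymbol{\delta}}(\overline{\mathbf{Q}}_*)$. Matrix $\overline{\mathbf{Q}}_*$ is the solution of
the standard waterfilling problem: maximize over $\mathbf{Q} \in \mathcal{C}_1$ the function $\log |\mathbf{I} + \mathbf{Q}\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)|$.

Proof: We first remark that maximizing function $\mathbf{Q} \mapsto \log |\mathbf{I} + \mathbf{Q}\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)|$ is equivalent to maximizing
function $\mathbf{Q} \mapsto V(\mathbf{Q}, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*)$ by (33). The proof then relies on the observation, proven hereafter, that, for
each $\mathbf{P} \in \mathcal{C}_1$,

$$ \langle \bar{I}'(\overline{\mathbf{Q}}_*), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle = \langle V'(\overline{\mathbf{Q}}_*, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle, \tag{34} $$

where $\langle V'(\overline{\mathbf{Q}}_*, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle$ is the Gâteaux differential of function $\mathbf{Q} \mapsto V(\mathbf{Q}, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*)$ at point $\overline{\mathbf{Q}}_*$
in direction $\mathbf{P} - \overline{\mathbf{Q}}_*$. Assuming (34) is verified, (32) yields that $\langle V'(\overline{\mathbf{Q}}_*, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle \le 0$ for each
matrix $\mathbf{P} \in \mathcal{C}_1$. And as the function $\mathbf{Q} \mapsto V(\mathbf{Q}, \boldsymbol{\delta}_*, \tilde{\boldsymbol{\delta}}_*)$ is strictly concave on $\mathcal{C}_1$, its unique argmax on
$\mathcal{C}_1$ coincides with $\overline{\mathbf{Q}}_*$.
It now remains to prove (34). Consider $\mathbf{P}$ and $\mathbf{Q} \in \mathcal{C}_1$. Then,

$$ \langle \bar{I}'(\mathbf{Q}), \mathbf{P} - \mathbf{Q} \rangle = \langle V'(\mathbf{Q}, \boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q})), \mathbf{P} - \mathbf{Q} \rangle + \sum_{l=1}^{L} \frac{\partial V}{\partial \kappa_l}(\mathbf{Q}, \boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q})) \langle \delta_l'(\mathbf{Q}), \mathbf{P} - \mathbf{Q} \rangle + \sum_{l=1}^{L} \frac{\partial V}{\partial \tilde{\kappa}_l}(\mathbf{Q}, \boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q})) \langle \tilde{\delta}_l'(\mathbf{Q}), \mathbf{P} - \mathbf{Q} \rangle. \tag{35} $$

Similarly to (30), partial derivatives $\frac{\partial V}{\partial \kappa_l}(\mathbf{Q}, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) = -\sigma^2 t (\tilde{\kappa}_l - \tilde{f}_l(\boldsymbol{\kappa}, \mathbf{Q}))$ and $\frac{\partial V}{\partial \tilde{\kappa}_l}(\mathbf{Q}, \boldsymbol{\kappa}, \tilde{\boldsymbol{\kappa}}) =
-\sigma^2 t (\kappa_l - f_l(\tilde{\boldsymbol{\kappa}}))$ are equal to zero at point $(\mathbf{Q}, \boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q}))$, as $(\boldsymbol{\delta}(\mathbf{Q}), \tilde{\boldsymbol{\delta}}(\mathbf{Q}))$ verifies (9) by definition.
Therefore, letting $\mathbf{Q} = \overline{\mathbf{Q}}_*$ in (35) yields:

$$ \langle \bar{I}'(\overline{\mathbf{Q}}_*), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle = \langle V'(\overline{\mathbf{Q}}_*, \boldsymbol{\delta}(\overline{\mathbf{Q}}_*), \tilde{\boldsymbol{\delta}}(\overline{\mathbf{Q}}_*)), \mathbf{P} - \overline{\mathbf{Q}}_* \rangle. $$

■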



Proposition 7 shows that the optimum matrix is solution of a waterfilling problem associated to the
covariance matrix $\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)$. This result cannot be used to evaluate $\overline{\mathbf{Q}}_*$, because the matrix $\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)$ itself
depends on $\overline{\mathbf{Q}}_*$. However, it provides some insight on the structure of the optimum matrix: the eigenvectors
of $\overline{\mathbf{Q}}_*$ coincide with the eigenvectors of a linear combination of matrices $\tilde{\mathbf{C}}^{(l)}$, the $\delta_l(\overline{\mathbf{Q}}_*)$ being the
coefficients of this linear combination. This is in line with the result of [4] Appendix VI.
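For reference, the standard waterfilling problem of Proposition 7 has a well-known explicit solution once the matrix $\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)$ is given. The symbols $\mathbf{U}$, $\lambda_i$, $p_i$ and $\mu$ below are introduced here for illustration only. Writing the eigendecomposition $\tilde{\mathbf{C}}(\boldsymbol{\delta}_*) = \mathbf{U}\, \mathrm{diag}(\lambda_1, \ldots, \lambda_t)\, \mathbf{U}^H$ with $\lambda_i > 0$, the maximizer of $\log|\mathbf{I} + \mathbf{Q}\tilde{\mathbf{C}}(\boldsymbol{\delta}_*)|$ over $\mathcal{C}_1$ takes the form

$$ \mathbf{Q} = \mathbf{U}\, \mathrm{diag}(p_1, \ldots, p_t)\, \mathbf{U}^H, \qquad p_i = \max\left( \mu - \frac{1}{\lambda_i},\; 0 \right), \qquad \frac{1}{t} \sum_{i=1}^{t} p_i = 1, $$

where the water level $\mu$ is the unique constant for which the power constraint on the right holds. This makes explicit why the eigenvectors of the optimum matrix are those of the linear combination of the $\tilde{\mathbf{C}}^{(l)}$.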
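We now introduce our iterative algorithm for optimizing $\bar{I}(\mathbf{Q})$:

• Initialization: $\mathbf{Q}_0 = \mathbf{I}$.

• Evaluation of $\mathbf{Q}_k$ from $\mathbf{Q}_{k-1}$: $(\boldsymbol{\delta}^{(k)}, \tilde{\boldsymbol{\delta}}^{(k)})$ is defined as the unique solution of (9) in which $\mathbf{Q} =
\mathbf{Q}_{k-1}$. Then $\mathbf{Q}_k$ is defined as the maximum of function $\mathbf{Q} \mapsto \log \left| \mathbf{I} + \mathbf{Q}\tilde{\mathbf{C}}(\boldsymbol{\delta}^{(k)}) \right|$ on $\mathcal{C}_1$.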
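The two steps of the algorithm above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names, the all-ones initialization of the fixed point, the tolerances and the iteration caps are assumptions, and the loop stops when successive values of $(\boldsymbol{\delta}^{(k)}, \tilde{\boldsymbol{\delta}}^{(k)})$ no longer change, without any guarantee that this happens.

```python
import numpy as np


def psd_sqrt(m):
    """Hermitian square root of a positive semidefinite matrix (via eigh)."""
    w, v = np.linalg.eigh(m)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.conj().T


def solve_canonical(c_rx, c_tx, q, sigma2, tol=1e-12, max_iter=10000):
    """Fixed-point iteration for the canonical system (9) at a given Q."""
    r, t = c_rx[0].shape[0], q.shape[0]
    q_half = psd_sqrt(q)
    c_eff = [q_half @ c @ q_half for c in c_tx]  # Q^{1/2} C~^(l) Q^{1/2}
    delta, delta_t = np.ones(len(c_rx)), np.ones(len(c_rx))
    for _ in range(max_iter):
        t_mat = np.linalg.inv(sigma2 * (np.eye(r) + sum(d * c for d, c in zip(delta_t, c_rx))))
        t_til = np.linalg.inv(sigma2 * (np.eye(t) + sum(d * c for d, c in zip(delta, c_eff))))
        new_d = np.array([np.trace(c @ t_mat).real / t for c in c_rx])
        new_dt = np.array([np.trace(c @ t_til).real / t for c in c_eff])
        step = max(np.abs(new_d - delta).max(), np.abs(new_dt - delta_t).max())
        delta, delta_t = new_d, new_dt
        if step < tol:
            break
    return delta, delta_t


def waterfill(gains, total):
    """Powers p_i >= 0 with sum(p) = total maximizing sum(log(1 + p_i * gains_i))."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]          # strongest modes first
    g = gains[order]
    p = np.zeros_like(g)
    for k in range(len(g), 0, -1):           # try k active modes, largest k first
        if g[k - 1] <= 0.0:
            continue
        mu = (total + np.sum(1.0 / g[:k])) / k   # candidate water level
        if mu > 1.0 / g[k - 1]:
            p[:k] = mu - 1.0 / g[:k]
            break
    out = np.zeros_like(p)
    out[order] = p
    return out


def iterative_waterfilling(c_rx, c_tx, sigma2, n_iter=50, tol=1e-9):
    """Alternate between solving (9) at Q_{k-1} and a waterfilling step for Q_k."""
    t = c_tx[0].shape[0]
    q = np.eye(t)                            # initialization Q_0 = I
    prev = None
    for _ in range(n_iter):
        delta, delta_t = solve_canonical(c_rx, c_tx, q, sigma2)
        ct_of = sum(d * c for d, c in zip(delta, c_tx))   # C~(delta^(k))
        lam, u = np.linalg.eigh(ct_of)
        q = (u * waterfill(lam, total=t)) @ u.conj().T    # Tr(Q)/t = 1
        cur = np.concatenate([delta, delta_t])
        if prev is not None and np.abs(cur - prev).max() < tol:
            break                            # successive deltas have stabilized
        prev = cur
    return q
```

When all transmit correlation matrices are the identity, the waterfilling step returns $\mathbf{Q} = \mathbf{I}$ at every iteration, which gives a simple sanity check of the loop.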



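We now establish a result which implies that, if the algorithm converges, then it converges towards
the optimal covariance matrix $\overline{\mathbf{Q}}_*$.
Proposition 8: Assume that

$$ \lim_{k \to \infty} \boldsymbol{\delta}^{(k)} - \boldsymbol{\delta}^{(k-1)} = \lim_{k \to \infty} \tilde{\boldsymbol{\delta}}^{(k)} - \tilde{\boldsymbol{\delta}}^{(k-1)} = \mathbf{0}. \tag{36} $$

Then, the algorithm converges towards matrix $\overline{\mathbf{Q}}_*$.

Proof: The sequence $(\mathbf{Q}_k)$ belongs to the set $\mathcal{C}_1$. As $\mathcal{C}_1$ is compact, we just have to verify that
every convergent subsequence $(\mathbf{Q}_{\psi(k)})_{k \in \mathbb{N}}$ extracted from $(\mathbf{Q}_k)_{k \in \mathbb{N}}$ converges towards $\overline{\mathbf{Q}}_*$. For this, we
denote by $\mathbf{Q}_{\psi,*}$ the limit of the above subsequence, and prove that this matrix verifies property (32) with
$\phi = \bar{I}$. Vectors $\boldsymbol{\delta}^{(\psi(k)+1)}$ and $\tilde{\boldsymbol{\delta}}^{(\psi(k)+1)}$ are defined as the solutions of (9) with $\mathbf{Q} = \mathbf{Q}_{\psi(k)}$. Hence, due to
the continuity of functions $\mathbf{Q} \mapsto \delta_l(\mathbf{Q})$ and $\mathbf{Q} \mapsto \tilde{\delta}_l(\mathbf{Q})$, sequences $(\boldsymbol{\delta}^{(\psi(k)+1)})_{k \in \mathbb{N}}$ and $(\tilde{\boldsymbol{\delta}}^{(\psi(k)+1)})_{k \in \mathbb{N}}$
converge towards $\boldsymbol{\delta}^{\psi,*} = \boldsymbol{\delta}(\mathbf{Q}_{\psi,*})$ and $\tilde{\boldsymbol{\delta}}^{\psi,*} = \tilde{\boldsymbol{\delta}}(\mathbf{Q}_{\psi,*})$ respectively. Moreover, $(\boldsymbol{\delta}^{\psi,*}, \tilde{\boldsymbol{\delta}}^{\psi,*})$ is solution
of system (9) in which matrix $\mathbf{Q}$ coincides with $\mathbf{Q}_{\psi,*}$. Therefore,

$$ \frac{\partial V}{\partial \kappa_l}(\mathbf{Q}_{\psi,*}, \boldsymbol{\delta}^{\psi,*}, \tilde{\boldsymbol{\delta}}^{\psi,*}) = \frac{\partial V}{\partial \tilde{\kappa}_l}(\mathbf{Q}_{\psi,*}, \boldsymbol{\delta}^{\psi,*}, \tilde{\boldsymbol{\delta}}^{\psi,*}) = 0. $$

As in the proof of Proposition 7, this leads to

$$ \langle \bar{I}'(\mathbf{Q}_{\psi,*}), \mathbf{P} - \mathbf{Q}_{\psi,*} \rangle = \langle V'(\mathbf{Q}_{\psi,*}, \boldsymbol{\delta}^{\psi,*}, \tilde{\boldsymbol{\delta}}^{\psi,*}), \mathbf{P} - \mathbf{Q}_{\psi,*} \rangle \tag{37} $$

for every $\mathbf{P} \in \mathcal{C}_1$. It remains to show that the right-hand side of (37) is negative to complete the proof.
By construction, $\mathbf{Q}_{\psi(k)}$ is the argmax over $\mathcal{C}_1$ of function $\mathbf{Q} \mapsto V(\mathbf{Q}, \boldsymbol{\delta}^{(\psi(k))}, \tilde{\boldsymbol{\delta}}^{(\psi(k))})$, so that (32) gives

$$ \langle V'(\mathbf{Q}_{\psi(k)}, \boldsymbol{\delta}^{(\psi(k))}, \tilde{\boldsymbol{\delta}}^{(\psi(k))}), \mathbf{P} - \mathbf{Q}_{\psi(k)} \rangle \le 0 \qquad \forall \mathbf{P} \in \mathcal{C}_1. \tag{38} $$

By condition (36), sequences $(\boldsymbol{\delta}^{(\psi(k))})$ and $(\tilde{\boldsymbol{\delta}}^{(\psi(k))})$ also converge towards $\boldsymbol{\delta}^{\psi,*}$ and $\tilde{\boldsymbol{\delta}}^{\psi,*}$ respectively. Taking
the limit of (38) when $k \to \infty$ eventually shows that $\langle V'(\mathbf{Q}_{\psi,*}, \boldsymbol{\delta}^{\psi,*}, \tilde{\boldsymbol{\delta}}^{\psi,*}), \mathbf{P} - \mathbf{Q}_{\psi,*} \rangle \le 0$ as required. ■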



To conclude, if the algorithm is convergent, that is, if the sequence $(\mathbf Q_k)_{k\in\mathbb N}$ converges towards a certain matrix, then the $\delta_l^{(k)} = \delta_l(\mathbf Q_{k-1})$ and the $\tilde\delta_l^{(k)} = \tilde\delta_l(\mathbf Q_{k-1})$ converge as well when $k \to \infty$. Condition (36) is then verified; hence, if the algorithm is convergent, it converges towards $\overline{\mathbf Q}_*$. Although the convergence of the algorithm has not been proved, this result is encouraging and suggests that the algorithm is reliable. In particular, the algorithm converged in all the simulations we conducted. In any case, condition (36) can easily be checked numerically. If it is not satisfied, it is possible to modify the initial point $\mathbf Q_0$ as many times as needed to ensure convergence.

V. Numerical Results 

We provide here some simulation results to evaluate the performance of the proposed approach.
We use the propagation model introduced in 0, in which each path corresponds to a scatterer cluster 
characterized by a mean angle of departure, a mean angle of arrival and an angle spread for each of 
these two angles. 

In the featured simulations for Fig. 1(a) (respectively Fig. 1(b)), we consider a frequency selective MIMO system with $r = t = 4$ (respectively $r = t = 8$), a carrier frequency of 2 GHz, and a number of paths $L = 5$. The paths share the same power, and their mean angles and angle spreads are given in Table I in radians. In both Fig. 1(a) and 1(b) we have represented the EMI $I(\mathbf I_t)$ (i.e. without optimization), and the optimized EMI $I(\overline{\mathbf Q}_*)$ (i.e. with an input covariance matrix maximizing the approximation $\bar I$). The EMI are evaluated by Monte-Carlo simulations, with $10^5$ channel realizations. The EMI optimized with Vu-Paulraj's algorithm [5] is also represented for comparison.

Vu-Paulraj's algorithm is composed of two nested iterative loops. The inner loop evaluates $\mathbf Q_*^{(n)} = \mathrm{argmax}\{I(\mathbf Q) + k_{\mathrm{barrier}} \log|\mathbf Q|\}$ thanks to the Newton algorithm with the constraint $\frac1t\mathrm{Tr}\,\mathbf Q = 1$, for a given value of $k_{\mathrm{barrier}}$ and a given starting point $\mathbf Q_0^{(n)}$. Maximizing $I(\mathbf Q) + k_{\mathrm{barrier}} \log|\mathbf Q|$ instead of $I(\mathbf Q)$ ensures that $\mathbf Q$ remains positive semi-definite through the steps of the Newton algorithm; this is the so-called barrier interior-point method. The outer loop then decreases $k_{\mathrm{barrier}}$ by a certain constant factor $\mu$, and gives the inner loop the next starting point $\mathbf Q_0^{(n+1)} = \mathbf Q_*^{(n)}$. The algorithm stops when the desired precision is obtained, or, as the Newton algorithm requires heavy Monte-Carlo simulations for the evaluation of the gradient and of the Hessian of $I(\mathbf Q)$, when the number of iterations of the outer loop reaches a given number $N_{\max}$. As in [5] we took $N_{\max} = 10$, $\mu = 100$, $2 \cdot 10^4$ trials for the Monte-Carlo simulations, and we started with $k_{\mathrm{barrier}} = \frac{1}{100}$.

TABLE I

PATHS ANGULAR PARAMETERS (in radians)

                          l = 1   l = 2   l = 3   l = 4   l = 5
mean departure angle      6.15    3.52    4.04    2.58    2.66
departure angle spread    0.06    0.09    0.05    0.05    0.03
mean arrival angle        4.85    3.48    1.71    5.31    0.06
arrival angle spread      0.06    0.08    0.05    0.02    0.11

TABLE II

AVERAGE EXECUTION TIME (in seconds)

                 L = 3         L = 4         L = 5
Vu-Paulraj       681           884           1077
New algorithm    7.0 · 10^-3   7.4 · 10^-3   8.3 · 10^-3

Both Fig. 1(a) and 1(b) show that maximizing $\bar I(\mathbf Q)$ over the input covariance leads to a significant improvement of $I(\mathbf Q)$. Our approach provides the same results as Vu-Paulraj's algorithm. Moreover, our algorithm is computationally much more efficient: in Vu-Paulraj's algorithm, the evaluation of the gradient and of the Hessian of $I(\mathbf Q)$ needs heavy Monte-Carlo simulations. Table II gives for both algorithms the average execution time in seconds to obtain the input covariance matrix, on a 3.16 GHz Intel Xeon CPU with 8 GB of RAM, for a number of paths $L = 3$, $L = 4$ and $L = 5$, given $r = t = 4$.
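
The Monte-Carlo evaluation of the EMI used in these comparisons can be sketched as below. This is a minimal sketch under stated assumptions, not the simulation code of the paper: the function names are hypothetical, the EMI is assumed to be $E[\log|\mathbf I_r + \frac{1}{\sigma^2}\mathbf H\mathbf Q\mathbf H^H|]$ with $\mathbf H = \sum_l \mathbf H^{(l)}$, and each path is assumed to be drawn as $\mathbf H^{(l)} = \frac{1}{\sqrt t}(\mathbf C^{(l)})^{1/2}\mathbf W_l(\tilde{\mathbf C}^{(l)})^{1/2}$ with i.i.d. standard complex Gaussian $\mathbf W_l$.

```python
import numpy as np

def psd_sqrt(A):
    """Hermitian square root of a positive semi-definite matrix."""
    w, U = np.linalg.eigh(A)
    return (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T

def emi_monte_carlo(C, Ct, Q, sigma2, n_trials=2000, seed=0):
    """Monte-Carlo estimate of E[log det(I + H Q H^H / sigma^2)] (in nats)."""
    rng = np.random.default_rng(seed)
    r, t = C[0].shape[0], Ct[0].shape[0]
    Cs = [psd_sqrt(c) for c in C]
    # For complex transmit correlations the index convention of the Kronecker
    # model may require a transpose here; real symmetric matrices are unaffected.
    Cts = [psd_sqrt(c) for c in Ct]
    total = 0.0
    for _ in range(n_trials):
        H = np.zeros((r, t), dtype=complex)
        for A, B in zip(Cs, Cts):
            W = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
            H += A @ W @ B / np.sqrt(t)
        _, logdet = np.linalg.slogdet(np.eye(r) + H @ Q @ H.conj().T / sigma2)
        total += logdet
    return total / n_trials
```

Using the same seed for two candidate covariance matrices makes the comparison of their estimated EMI less sensitive to Monte-Carlo noise, since both are evaluated on the same channel draws.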



VI. Conclusion 

In this paper we have addressed the evaluation of the capacity-achieving covariance matrices of frequency selective MIMO channels. We have first clarified the definition of the large system approximation of the EMI and rigorously proved its expression and convergence speed with Gaussian methods. We have then proposed to optimize the EMI through this approximation, and have introduced an attractive algorithm based on an iterative waterfilling scheme. Numerical results have shown that our approach provides the same results as a direct approach, but with a much lower computation time.

Appendix A 
Proof of the existence of a solution 

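To study (9), it is quite useful to interpret functions $f$ and $\tilde f$ as functions of the parameter $-\sigma^2 \in \mathbb R^-$, to extend their domain of validity from $\mathbb R^-$ to $\mathbb C - \mathbb R^+$, and to use powerful results concerning a certain class of analytic functions. We therefore define the functions $g(\tilde\psi)(z)$ and $\tilde g(\psi)(z)$ as

$$g(\tilde\psi)(z) = \left[g_1(\tilde\psi)(z), \dots, g_L(\tilde\psi)(z)\right]^T \quad\text{where}\quad g_l(\tilde\psi)(z) = \frac1t \mathrm{Tr}\left[\mathbf C^{(l)}\,\mathbf T^{\tilde\psi}(z)\right],$$

$$\tilde g(\psi)(z) = \left[\tilde g_1(\psi)(z), \dots, \tilde g_L(\psi)(z)\right]^T \quad\text{where}\quad \tilde g_l(\psi)(z) = \frac1t \mathrm{Tr}\left[\tilde{\mathbf C}^{(l)}\,\tilde{\mathbf T}^{\psi}(z)\right],$$

with $\psi(z) = (\psi_1(z), \dots, \psi_L(z))^T$, $\tilde\psi(z) = (\tilde\psi_1(z), \dots, \tilde\psi_L(z))^T$ and where matrices $\mathbf T^{\tilde\psi}(z)$ and $\tilde{\mathbf T}^{\psi}(z)$ are defined by

$$\mathbf T^{\tilde\psi}(z) = \Big[-z\Big(\mathbf I_r + \sum_{j=1}^{L}\tilde\psi_j(z)\,\mathbf C^{(j)}\Big)\Big]^{-1}, \tag{39}$$

$$\tilde{\mathbf T}^{\psi}(z) = \Big[-z\Big(\mathbf I_t + \sum_{j=1}^{L}\psi_j(z)\,\tilde{\mathbf C}^{(j)}\Big)\Big]^{-1}. \tag{40}$$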



In order to explain the following results, we now have to introduce the concept of Stieltjes transforms.

Definition 2: Let $\mu$ be a finite positive measure carried by $\mathbb R^+$ (finite means that $\mu(\mathbb R^+) < \infty$). The Stieltjes transform of $\mu$ is the function $s(z)$ defined for $z \in \mathbb C - \mathbb R^+$ by

$$s(z) = \int_{\mathbb R^+} \frac{d\mu(\lambda)}{\lambda - z}. \tag{41}$$

In the following, the class of all Stieltjes transforms of finite positive measures carried by $\mathbb R^+$ is denoted $\mathcal S(\mathbb R^+)$. We now state some of the properties of the elements of $\mathcal S(\mathbb R^+)$.

Proposition 9: Let $s(z) \in \mathcal S(\mathbb R^+)$, and $\mu$ its associated measure. Then we have the following results:

(i) $s(z)$ is analytic on $\mathbb C - \mathbb R^+$,
(ii) $\mathrm{Im}(s(z)) > 0$ if $\mathrm{Im}(z) > 0$, and $\mathrm{Im}(s(z)) < 0$ if $\mathrm{Im}(z) < 0$,
(iii) $\mathrm{Im}(z s(z)) > 0$ if $\mathrm{Im}(z) > 0$, and $\mathrm{Im}(z s(z)) < 0$ if $\mathrm{Im}(z) < 0$,
(iv) $s(-\sigma^2) > 0$ for $\sigma^2 > 0$,
(v) $|s(z)| \le \dfrac{\mu(\mathbb R^+)}{d(z, \mathbb R^+)}$ for $z \in \mathbb C - \mathbb R^+$,
(vi) $\mu(\mathbb R^+) = \lim_{y\to\infty} -iy\, s(iy)$.

Proof: All the stated properties are standard material, see e.g. Appendix of [14].
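
The properties of Proposition 9 can be checked numerically on a discrete measure. The sketch below is only an illustration: the helper names `stieltjes` and `dist_to_R_plus` are hypothetical, and the measure is assumed to be a finite sum of point masses $\mu = \sum_k w_k\,\delta_{\lambda_k}$ with $\lambda_k \ge 0$, for which (41) reduces to a finite sum.

```python
import numpy as np

def stieltjes(z, atoms, weights):
    """s(z) = sum_k w_k / (lambda_k - z) for a discrete measure on R+."""
    atoms = np.asarray(atoms, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return complex(np.sum(weights / (atoms - z)))

def dist_to_R_plus(z):
    """Distance from z to the closed half-line R+ = [0, +inf)."""
    z = complex(z)
    return abs(z.imag) if z.real >= 0 else abs(z)

atoms, weights = [0.5, 2.0, 3.0], [0.2, 0.3, 0.5]   # total mass 1
z = 1.0 + 2.0j
s = stieltjes(z, atoms, weights)
print("Im s(z)   > 0:", s.imag > 0)                              # item (ii)
print("Im z s(z) > 0:", (z * s).imag > 0)                        # item (iii)
print("s(-0.7)   > 0:", stieltjes(-0.7, atoms, weights).real > 0)  # item (iv)
print("|s(z)| <= mass/d:", abs(s) <= sum(weights) / dist_to_R_plus(z))  # item (v)
y = 1e8
print("mass from (vi):", (-1j * y * stieltjes(1j * y, atoms, weights)).real)
```

Item (v) is the bound used repeatedly in the appendices to obtain uniform control of the iterates on compact subsets of $\mathbb C - \mathbb R^+$.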



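Conversely, a useful tool to prove that a certain function belongs to $\mathcal S(\mathbb R^+)$ is the following proposition:

Proposition 10: Let $s$ be a function holomorphic on $\mathbb C - \mathbb R^+$ which verifies the three following properties:

(i) $\mathrm{Im}(s(z)) > 0$ if $\mathrm{Im}(z) > 0$,
(ii) $\mathrm{Im}(z s(z)) > 0$ if $\mathrm{Im}(z) > 0$,
(iii) $\sup_{y>0} |iy\, s(iy)| < \infty$.

Then $s \in \mathcal S(\mathbb R^+)$, and if $\mu$ represents the corresponding positive measure, then $\mu(\mathbb R^+) = \lim_{y\to\infty}(-iy\, s(iy))$.

Proof: see Appendix of [14]. ■

Now that we have recalled the notion of Stieltjes transforms and its associated basic properties, we can introduce the following proposition:

Proposition 11: Let $(\psi_l, \tilde\psi_l)_{l=1,\dots,L}$ be elements of $\mathcal S(\mathbb R^+)$. We define functions $\varphi_l(z)$ and $\tilde\varphi_l(z)$, $l = 1, \dots, L$, as

$$\varphi_l(z) = \frac1t\mathrm{Tr}\left[\mathbf C^{(l)}\,\mathbf T^{\tilde\psi}(z)\right], \qquad \tilde\varphi_l(z) = \frac1t\mathrm{Tr}\left[\tilde{\mathbf C}^{(l)}\,\tilde{\mathbf T}^{\psi}(z)\right].$$

Then we have the following results:

(i) $\mathbf T^{\tilde\psi}$, $\tilde{\mathbf T}^{\psi}$ are holomorphic on $\mathbb C - \mathbb R^+$,
(ii) $\|\mathbf T^{\tilde\psi}(z)\| \le \dfrac{1}{d(z,\mathbb R^+)}$ and $\|\tilde{\mathbf T}^{\psi}(z)\| \le \dfrac{1}{d(z,\mathbb R^+)}$ on $\mathbb C - \mathbb R^+$,
(iii) $\varphi_l \in \mathcal S(\mathbb R^+)$ with the corresponding mass $\mu_l$ verifying $\mu_l(\mathbb R^+) = \frac1t\mathrm{Tr}\,\mathbf C^{(l)}$, and $\tilde\varphi_l \in \mathcal S(\mathbb R^+)$ with the corresponding mass $\tilde\mu_l$ verifying $\tilde\mu_l(\mathbb R^+) = \frac1t\mathrm{Tr}\,\tilde{\mathbf C}^{(l)}$.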

Proof: For item (i) we only have to check that $-z\big(\mathbf I_r + \sum_{j=1}^{L}\tilde\psi_j(z)\mathbf C^{(j)}\big)$ is invertible for every $z \in \mathbb C - \mathbb R^+$ to prove that $\mathbf T^{\tilde\psi}$ is holomorphic on $\mathbb C - \mathbb R^+$. The key point is to notice that, for any nonzero vector $\mathbf v$, for $z$ such that $\mathrm{Im}(z) > 0$,

$$\mathrm{Im}\Big\{\mathbf v^H z\Big(\mathbf I_r + \sum_{j=1}^{L}\tilde\psi_j(z)\mathbf C^{(j)}\Big)\mathbf v\Big\} = \mathrm{Im}\{z\}\,\mathbf v^H\mathbf v + \sum_{j=1}^{L}\mathrm{Im}\{z\tilde\psi_j(z)\}\,\mathbf v^H\mathbf C^{(j)}\mathbf v > 0.$$

A similar inequality holds for $\mathrm{Im}(z) < 0$, and the case $z \in \mathbb R^-$ is straightforward.

Item (iii) can easily be proved thanks to Proposition 10.

As for item (ii), the proof is essentially the same as the proof of Proposition 5.1 item 3 in [13], and is therefore omitted. ■

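We consider the following iterative scheme:

$$\psi^{(n+1)}(z) = g(\tilde\psi^{(n)})(z), \qquad \tilde\psi^{(n+1)}(z) = \tilde g(\psi^{(n)})(z), \tag{42}$$

with a starting point $(\psi^{(0)}(z), \tilde\psi^{(0)}(z))$ in $(\mathcal S(\mathbb R^+))^{2L}$. Item (iii) of Proposition 11 then ensures that, for each $n \ge 1$, $\psi^{(n)}(z)$ and $\tilde\psi^{(n)}(z)$ belong to $(\mathcal S(\mathbb R^+))^{L}$. Moreover,

$$(\psi_l^{(n+1)} - \psi_l^{(n)})(z) = g_l(\tilde\psi^{(n)})(z) - g_l(\tilde\psi^{(n-1)})(z) = \frac1t\mathrm{Tr}\left[\mathbf C^{(l)}\big(\mathbf T^{(n)}(z) - \mathbf T^{(n-1)}(z)\big)\right] \tag{43}$$

where matrices $\mathbf T^{(n)}(z)$ and $\tilde{\mathbf T}^{(n)}(z)$ are defined by $\mathbf T^{(n)}(z) = \mathbf T^{\tilde\psi^{(n)}}(z)$, $\tilde{\mathbf T}^{(n)}(z) = \tilde{\mathbf T}^{\psi^{(n)}}(z)$. Using the equality $\mathbf A - \mathbf B = \mathbf A(\mathbf B^{-1} - \mathbf A^{-1})\mathbf B$, we then obtain

$$\mathbf T^{(n)}(z) - \mathbf T^{(n-1)}(z) = \mathbf T^{(n)}(z)\Big[-z\sum_{j=1}^{L}\big(\tilde\psi_j^{(n-1)}(z) - \tilde\psi_j^{(n)}(z)\big)\mathbf C^{(j)}\Big]\mathbf T^{(n-1)}(z). \tag{44}$$

Using (44) in (43) then yields

$$(\psi_l^{(n+1)} - \psi_l^{(n)})(z) = \frac zt\sum_{j=1}^{L}\big(\tilde\psi_j^{(n)} - \tilde\psi_j^{(n-1)}\big)(z)\;\mathrm{Tr}\left[\mathbf C^{(l)}\mathbf T^{(n)}(z)\mathbf C^{(j)}\mathbf T^{(n-1)}(z)\right]. \tag{45}$$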



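The trace in the above expression can be bounded with the help of $C_{\max} = \max_j\{\|\mathbf C^{(j)}\|, \|\tilde{\mathbf C}^{(j)}\|\}$:

$$\big|(\psi_l^{(n+1)} - \psi_l^{(n)})(z)\big| \le \frac rt|z|\sum_{j=1}^{L}\big|(\tilde\psi_j^{(n)} - \tilde\psi_j^{(n-1)})(z)\big|\,\|\mathbf C^{(l)}\|\,\|\mathbf T^{(n)}(z)\|\,\|\mathbf C^{(j)}\|\,\|\mathbf T^{(n-1)}(z)\| \tag{46}$$

$$\le \frac rt|z|\,C_{\max}^2\,\|\mathbf T^{(n)}(z)\|\,\|\mathbf T^{(n-1)}(z)\|\sum_{j=1}^{L}\big|(\tilde\psi_j^{(n)} - \tilde\psi_j^{(n-1)})(z)\big|. \tag{47}$$

We now consider $z \in \mathbb C - \mathbb R^+$. Then $\mathbf T^{(n)}(z)$ and $\mathbf T^{(n-1)}(z)$ have a spectral norm less than $1/d(z, \mathbb R^+)$ by item (ii) of Proposition 11. Therefore,

$$\big|(\psi_l^{(n+1)} - \psi_l^{(n)})(z)\big| \le \frac rt\,\frac{|z|\,C_{\max}^2}{(d(z,\mathbb R^+))^2}\sum_{j=1}^{L}\big|(\tilde\psi_j^{(n)} - \tilde\psi_j^{(n-1)})(z)\big|. \tag{48}$$

A similar computation leads to

$$\big|(\tilde\psi_l^{(n+1)} - \tilde\psi_l^{(n)})(z)\big| \le \frac{|z|\,C_{\max}^2}{(d(z,\mathbb R^+))^2}\sum_{j=1}^{L}\big|(\psi_j^{(n)} - \psi_j^{(n-1)})(z)\big|. \tag{49}$$

We now introduce the following maximum:

$$M^{(n)}(z) = \max_l\left\{\big|(\psi_l^{(n+1)} - \psi_l^{(n)})(z)\big|,\ \big|(\tilde\psi_l^{(n+1)} - \tilde\psi_l^{(n)})(z)\big|\right\}.$$

Equations (48) and (49) can then be combined into:

$$M^{(n)}(z) \le \varepsilon(z)\,M^{(n-1)}(z),$$

where $\varepsilon(z) = c_1\dfrac{|z|}{(d(z,\mathbb R^+))^2}$, with $c_1 = \max\left\{\frac rt L C_{\max}^2,\ L C_{\max}^2\right\}$. We define the following domain:

$$\mathcal U = \left\{z \in \mathbb C,\ d(z,\mathbb R^+) \ge \frac{2c_1}{K},\ \frac{|z|}{d(z,\mathbb R^+)} \le 2\right\},$$

with $0 < K < 1$. On this domain $\mathcal U$ we have $\varepsilon(z) \le K$, hence $M^{(n)}(z) \le K M^{(n-1)}(z)$. Hence, for $z \in \mathcal U$, $\psi_l^{(n)}(z)$ and $\tilde\psi_l^{(n)}(z)$ are Cauchy sequences and, as such, converge. We denote by $\psi_l(z)$ and $\tilde\psi_l(z)$ their respective limits.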
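
The contraction argument can be observed numerically. The following is a hedged sketch rather than part of the proof: `scheme_42` is a hypothetical helper that runs the simultaneous updates of (42) at a fixed complex $z$, under the assumption that $\mathbf T^{\tilde\psi}(z)$ and $\tilde{\mathbf T}^{\psi}(z)$ have the form $[-z(\mathbf I + \sum_j \cdot)]^{-1}$, and starts from $-1/z$, the Stieltjes transform of a unit mass at 0.

```python
import numpy as np

def scheme_42(C, Ct, z, n_steps=10):
    """Run the iterative scheme (42) at a point z; return iterates and the M^(n)."""
    L, r, t = len(C), C[0].shape[0], Ct[0].shape[0]
    psi = np.full(L, -1.0 / z, dtype=complex)    # psi^(0), an element of S(R+)
    psi_t = np.full(L, -1.0 / z, dtype=complex)  # psi_tilde^(0)
    M = []
    for _ in range(n_steps):
        T = np.linalg.inv(-z * (np.eye(r) + sum(p * c for p, c in zip(psi_t, C))))
        Tt = np.linalg.inv(-z * (np.eye(t) + sum(p * c for p, c in zip(psi, Ct))))
        new_psi = np.array([np.trace(c @ T) / t for c in C])
        new_psi_t = np.array([np.trace(c @ Tt) / t for c in Ct])
        # M^(n) = largest change between two successive iterates
        M.append(max(np.abs(new_psi - psi).max(), np.abs(new_psi_t - psi_t).max()))
        psi, psi_t = new_psi, new_psi_t
    return psi, psi_t, M

# r = t = 2, L = 1, C_max = 1, so c_1 = 1; with K = 0.5 the point z = -3 + 4j
# satisfies d(z, R+) = 5 >= 2 c_1 / K = 4 and |z| / d(z, R+) = 1 <= 2.
I2 = np.eye(2)
_, _, M = scheme_42([I2], [I2], -3.0 + 4.0j)
print([M[n] / M[n - 1] for n in range(1, len(M))])  # each ratio stays below K
```

Taking $z = -\sigma^2$ in the same routine gives the fixed-point iteration on $(\boldsymbol\delta, \tilde{\boldsymbol\delta})$ discussed at the end of this appendix.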

We now want to extend this convergence result to $\mathbb C - \mathbb R^+$. We first notice that, as $\psi_l^{(n)}$ is a Stieltjes transform whose associated measure has mass $\frac1t\mathrm{Tr}\,\mathbf C^{(l)}$, item (v) of Proposition 9 implies

$$\big|\psi_l^{(n)}(z)\big| \le \frac{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}{d(z,\mathbb R^+)}.$$

The $\psi_l^{(n)}$ are thus bounded on any compact set included in $\mathbb C - \mathbb R^+$, uniformly in $n$. By Montel's theorem, $(\psi_l^{(n)})_{n\in\mathbb N}$ is a normal family. Therefore one can extract a subsequence converging uniformly on compact sets of $\mathbb C - \mathbb R^+$, whose limit is thus analytic over $\mathbb C - \mathbb R^+$. This limit coincides with $\psi_l$ on domain $\mathcal U$. The limit of any converging subsequence of $(\psi_l^{(n)})$ thus coincides with $\psi_l$ on $\mathcal U$. Therefore, these limits all coincide on $\mathbb C - \mathbb R^+$ with a function analytic on $\mathbb C - \mathbb R^+$, that we still denote $\psi_l$. The converging subsequences of $(\psi_l^{(n)})$ have thus the same limit. We have therefore shown the convergence of the whole sequence $(\psi_l^{(n)})_{n\in\mathbb N}$ on $\mathbb C - \mathbb R^+$ towards an analytic function $\psi_l$. Moreover, as one can check that $\psi_l$ verifies the assumptions of Proposition 10, we have $\psi_l(z) \in \mathcal S(\mathbb R^+)$. The same arguments hold for the $\tilde\psi_l(z)$.

We have proved the convergence of iterative sequence (42). Taking $z = -\sigma^2$ then yields the convergence of the fixed point algorithm (12). Note that the starting point $(\boldsymbol\delta^{(0)}, \tilde{\boldsymbol\delta}^{(0)})$ only needs to verify $\delta_l^{(0)} > 0$, $\tilde\delta_l^{(0)} > 0$ ($l = 1, \dots, L$), as any positive real number can be interpreted as the value at point $z = -\sigma^2$ of some element $s(z) \in \mathcal S(\mathbb R^+)$. Moreover, the limits $\psi_l(z)$, $\tilde\psi_l(z)$ ($l = 1, \dots, L$) of the iterative sequence (42) are positive for any $z = -\sigma^2$ by item (iv) of Proposition 9, as they all are Stieltjes transforms. Therefore, the limits $\delta_l$, $\tilde\delta_l$ ($l = 1, \dots, L$) are positive.



Appendix B 

A FIRST LARGE SYSTEM APPROXIMATION OF $E_{\mathbf H}[\mathrm{Tr}\,\mathbf S]$ - PROOF OF PROPOSITION 1

In this section, if $x$ is a random variable we denote by $\mathring x$ the zero mean random variable $\mathring x = x - E(x)$.

We will prove Proposition 1 by deriving the matrix $\boldsymbol\Upsilon$ defined by (21), before proving that it satisfies $\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A) = O\left(\frac{1}{t^2}\right)$ for any uniformly bounded matrix $\mathbf A$. To that end, as the entries of matrices $\mathbf H^{(l)}$ are Gaussian, we can use the classical Gaussian methods: we introduce here two Gaussian tools, an Integration by Parts formula and the Nash-Poincaré inequality, both widely used in Random Matrix
Theory (see e.g. Ifl6l0 . 

We first present an Integration by Parts formula which provides the expectation of some functionals 
of Gaussian vectors (see e.g. ifTTTD . 

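Theorem 4: Let $\boldsymbol\xi = [\xi_1, \dots, \xi_M]^T$ be a complex Gaussian random vector such that $E[\boldsymbol\xi] = 0$, $E[\boldsymbol\xi\boldsymbol\xi^T] = 0$ and $E[\boldsymbol\xi\boldsymbol\xi^H] = \boldsymbol\Omega$. If $\Gamma : \boldsymbol\xi \mapsto \Gamma(\boldsymbol\xi)$ is a $C^1$ complex function polynomially bounded together with its derivatives, then

$$E[\xi_p\,\Gamma(\boldsymbol\xi)] = \sum_{m=1}^{M}\Omega_{pm}\,E\left[\frac{\partial\Gamma(\boldsymbol\xi)}{\partial\xi_m^*}\right]. \tag{50}$$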
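
As a quick sanity check of formula (50), consider the scalar case. The worked example below is an illustration added here, with the choice of $\Gamma$ being arbitrary; it treats $\xi$ and $\xi^*$ as independent variables (Wirtinger calculus) and uses the fact that $|\xi|^2$ is exponentially distributed with mean $\Omega$ for a circular complex Gaussian $\xi$.

```latex
% Scalar case M = 1: \xi \sim \mathcal{CN}(0, \Omega), \quad \Gamma(\xi) = \xi\,(\xi^*)^2 .
\begin{align*}
E[\xi\,\Gamma(\xi)] &= E\big[|\xi|^4\big] = 2\,\Omega^2 ,\\
\Omega\,E\!\left[\frac{\partial \Gamma}{\partial \xi^*}\right]
  &= \Omega\,E\big[2\,\xi\,\xi^*\big] = 2\,\Omega\,E\big[|\xi|^2\big] = 2\,\Omega^2 ,
\end{align*}
% so both sides of (50) agree.
```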



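In the present context we consider $\boldsymbol\xi$ being the vector of the stacked columns of matrices $\mathbf H^{(l)}$, where the channels $\mathbf H^{(l)}$ are independent and follow the Kronecker model, i.e. $E_{\mathbf H}\big[H^{(l)}_{ij}H^{(l')*}_{mn}\big] = \frac1t\,\delta_{l,l'}\,C^{(l)}_{im}\tilde C^{(l)}_{jn}$, where $\delta_{l,l'}$ is the Kronecker symbol. Then (50) becomes

$$E_{\mathbf H}\left[H^{(l)}_{ij}\,\Gamma(\mathbf H^{(1)}, \dots, \mathbf H^{(L)})\right] = \frac1t\sum_{m=1}^{r}\sum_{n=1}^{t} C^{(l)}_{im}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\left[\frac{\partial\Gamma}{\partial H^{(l)*}_{mn}}\right]. \tag{51}$$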



The second useful tool is the Poincare Nash inequality which bounds the variance of certain functionals 
of Gaussian vectors (see e.g. iTToll . @). 

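Theorem 5: Let $\boldsymbol\xi = [\xi_1, \dots, \xi_M]^T$ be a complex Gaussian random vector such that $E[\boldsymbol\xi] = 0$, $E[\boldsymbol\xi\boldsymbol\xi^T] = 0$ and $E[\boldsymbol\xi\boldsymbol\xi^H] = \boldsymbol\Omega$. If $\Gamma : \boldsymbol\xi \mapsto \Gamma(\boldsymbol\xi)$ is a $C^1$ complex function polynomially bounded together with its derivatives, then, noting $\nabla_{\xi}\Gamma = \left[\frac{\partial\Gamma}{\partial\xi_1}, \dots, \frac{\partial\Gamma}{\partial\xi_M}\right]^T$ and $\nabla_{\bar\xi}\Gamma = \left[\frac{\partial\Gamma}{\partial\xi_1^*}, \dots, \frac{\partial\Gamma}{\partial\xi_M^*}\right]^T$,

$$\mathrm{var}(\Gamma(\boldsymbol\xi)) \le E\left[\nabla_{\xi}\Gamma(\boldsymbol\xi)^T\,\boldsymbol\Omega\,\overline{\nabla_{\xi}\Gamma(\boldsymbol\xi)}\right] + E\left[\nabla_{\bar\xi}\Gamma(\boldsymbol\xi)^H\,\boldsymbol\Omega\,\nabla_{\bar\xi}\Gamma(\boldsymbol\xi)\right]. \tag{52}$$

In the following we will use the Nash-Poincaré inequality with $\boldsymbol\xi$ being the vector of the stacked columns of independent matrices $\mathbf H^{(l)}$, where the channels $\mathbf H^{(l)}$ follow the Kronecker model. Then (52) becomes

$$\mathrm{var}\big(\Gamma(\mathbf H^{(1)}, \dots, \mathbf H^{(L)})\big) \le \frac1t\sum_{l=1}^{L}\sum_{i,m=1}^{r}\sum_{j,n=1}^{t} C^{(l)}_{im}\tilde C^{(l)}_{jn}\,E\left[\frac{\partial\Gamma}{\partial H^{(l)}_{ij}}\left(\frac{\partial\Gamma}{\partial H^{(l)}_{mn}}\right)^{*} + \left(\frac{\partial\Gamma}{\partial H^{(l)*}_{ij}}\right)^{*}\frac{\partial\Gamma}{\partial H^{(l)*}_{mn}}\right]. \tag{53}$$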
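
To illustrate how the Nash-Poincaré inequality (52) is used, here is a scalar worked example added for illustration (the choice of $\Gamma$ is arbitrary). It again relies on $|\xi|^2$ being exponentially distributed with mean $\Omega$, so that its variance is $\Omega^2$.

```latex
% Scalar case M = 1: \xi \sim \mathcal{CN}(0, \Omega), \quad \Gamma(\xi) = \xi\,\xi^* = |\xi|^2 .
\begin{align*}
\mathrm{var}(\Gamma) &= E\big[|\xi|^4\big] - \big(E\big[|\xi|^2\big]\big)^2 = 2\Omega^2 - \Omega^2 = \Omega^2 ,\\
E\!\left[\Big|\frac{\partial\Gamma}{\partial\xi}\Big|^2\right]\Omega
  + E\!\left[\Big|\frac{\partial\Gamma}{\partial\xi^*}\Big|^2\right]\Omega
  &= \Omega\,E\big[|\xi^*|^2\big] + \Omega\,E\big[|\xi|^2\big] = 2\,\Omega^2 ,
\end{align*}
% so the bound (52) reads \Omega^2 \le 2\,\Omega^2 in this case.
```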



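Using these two Gaussian tools we now prove Proposition 1. In order to derive the matrix $\boldsymbol\Upsilon$ defined by $E_{\mathbf H}[\mathbf S] = \mathbf R + \boldsymbol\Upsilon$ we study the entries of $E_{\mathbf H}[\mathbf S]$. Using the resolvent identity (27) we have $\sigma^2 E_{\mathbf H}[S_{pq}] = \big(\mathbf I_r - E_{\mathbf H}[\mathbf S\mathbf H\mathbf H^H]\big)_{pq}$. We evaluate $E_{\mathbf H}[(\mathbf S\mathbf H\mathbf H^H)_{pq}]$ by first studying $E_{\mathbf H}\big[S_{pi}H^{(l)}_{ij}H^{(l')*}_{qk}\big]$. The calculation begins with an integration by parts on $H^{(l)}_{ij}$ (51):

$$E_{\mathbf H}\big[S_{pi}H^{(l)}_{ij}H^{(l')*}_{qk}\big] = \frac1t\sum_{m,n} C^{(l)}_{im}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\left[\frac{\partial\big(S_{pi}H^{(l')*}_{qk}\big)}{\partial H^{(l)*}_{mn}}\right] = \frac1t\sum_{m,n} C^{(l)}_{im}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\left[S_{pi}\,\delta_{l,l'}\delta_{q,m}\delta_{k,n} + H^{(l')*}_{qk}\frac{\partial S_{pi}}{\partial H^{(l)*}_{mn}}\right].$$

As $\dfrac{\partial S_{pi}}{\partial H^{(l)*}_{mn}} = -(\mathbf S\mathbf H)_{pn}S_{mi}$, we obtain

$$E_{\mathbf H}\big[S_{pi}H^{(l)}_{ij}H^{(l')*}_{qk}\big] = \frac1t C^{(l)}_{iq}\tilde C^{(l)}_{jk}\,E_{\mathbf H}[S_{pi}]\,\delta_{l,l'} - \frac1t\sum_{m,n} C^{(l)}_{im}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\big[H^{(l')*}_{qk}(\mathbf S\mathbf H)_{pn}S_{mi}\big]$$

$$= \frac1t C^{(l)}_{iq}\tilde C^{(l)}_{jk}\,E_{\mathbf H}[S_{pi}]\,\delta_{l,l'} - \frac1t\sum_{n}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\big[H^{(l')*}_{qk}(\mathbf S\mathbf H)_{pn}(\mathbf C^{(l)}\mathbf S)_{ii}\big].$$

Summing over $i$, $l$ and $l'$ then leads to:

$$E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pj}H^{*}_{qk}\big] = \sum_{l}\frac1t E_{\mathbf H}\big[(\mathbf S\mathbf C^{(l)})_{pq}\big]\tilde C^{(l)}_{jk} - \sum_{l}\sum_{n}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\Big[H^{*}_{qk}(\mathbf S\mathbf H)_{pn}\,\frac1t\mathrm{Tr}(\mathbf S\mathbf C^{(l)})\Big].$$

To separate the terms under the last expectation, we denote $\eta_l = \frac1t\mathrm{Tr}(\mathbf S\mathbf C^{(l)}) = \alpha_l + \mathring\eta_l$, where $\alpha_l = E_{\mathbf H}[\eta_l]$. We can then write $E_{\mathbf H}\big[H^{*}_{qk}(\mathbf S\mathbf H)_{pn}\eta_l\big] = \alpha_l E_{\mathbf H}\big[H^{*}_{qk}(\mathbf S\mathbf H)_{pn}\big] + E_{\mathbf H}\big[H^{*}_{qk}(\mathbf S\mathbf H)_{pn}\mathring\eta_l\big]$, hence

$$E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pj}H^{*}_{qk}\big] = \sum_{l}\frac1t E_{\mathbf H}\big[(\mathbf S\mathbf C^{(l)})_{pq}\big]\tilde C^{(l)}_{jk} - \sum_{l}\alpha_l\sum_{n}\tilde C^{(l)}_{jn}\,E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pn}H^{*}_{qk}\big] - \Delta^{(p,q)}_{jk}, \tag{54}$$

where $\Delta^{(p,q)}_{jk} = \sum_{l}\sum_{n}E_{\mathbf H}\big[\mathring\eta_l\,H^{*}_{qk}(\mathbf S\mathbf H)_{pn}\big]\tilde C^{(l)}_{jn}$. We here notice the presence of $E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pn}H^{*}_{qk}\big]$ on both sides of equality (54). Hence, let us denote by $\mathbf A^{(p,q)}$ the $t \times t$ matrix with entries $A^{(p,q)}_{jk} = E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pj}H^{*}_{qk}\big]$. Then (54) becomes

$$\mathbf A^{(p,q)} = \sum_{l}\frac1t E_{\mathbf H}\big[(\mathbf S\mathbf C^{(l)})_{pq}\big]\tilde{\mathbf C}^{(l)} - \Big(\sum_{l}\alpha_l\tilde{\mathbf C}^{(l)}\Big)\mathbf A^{(p,q)} - \boldsymbol\Delta^{(p,q)}.$$

Recalling that $\tilde{\mathbf R} = \sigma^{-2}\big[\mathbf I_t + \sum_{l}\alpha_l\tilde{\mathbf C}^{(l)}\big]^{-1}$, this leads to

$$\mathbf A^{(p,q)} = \sigma^2\sum_{l}\frac1t E_{\mathbf H}\big[(\mathbf S\mathbf C^{(l)})_{pq}\big]\tilde{\mathbf R}\tilde{\mathbf C}^{(l)} - \sigma^2\tilde{\mathbf R}\boldsymbol\Delta^{(p,q)}.$$

We now come back to the calculation of $E_{\mathbf H}[S_{pq}] = \frac{1}{\sigma^2}\big(\mathbf I_r - E_{\mathbf H}[\mathbf S\mathbf H\mathbf H^H]\big)_{pq}$ by noticing that $E_{\mathbf H}[(\mathbf S\mathbf H\mathbf H^H)_{pq}] = \sum_{j}E_{\mathbf H}\big[(\mathbf S\mathbf H)_{pj}H^{*}_{qj}\big] = \mathrm{Tr}(\mathbf A^{(p,q)})$. Therefore

$$E_{\mathbf H}[S_{pq}] = \frac{1}{\sigma^2}\delta_{p,q} - \sum_{l}\tilde\alpha_l\,E_{\mathbf H}\big[(\mathbf S\mathbf C^{(l)})_{pq}\big] + \mathrm{Tr}\big(\tilde{\mathbf R}\boldsymbol\Delta^{(p,q)}\big),$$

as $\tilde\alpha_l = \frac1t\mathrm{Tr}\big(\tilde{\mathbf R}\tilde{\mathbf C}^{(l)}\big)$ (20). Coming back to the definition of matrix $\boldsymbol\Delta^{(p,q)}$, we notice that $\mathrm{Tr}\big(\tilde{\mathbf R}\boldsymbol\Delta^{(p,q)}\big) = \sum_{l}E_{\mathbf H}\big[\mathring\eta_l\,(\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H)_{pq}\big]$. Hence the matrix $E_{\mathbf H}[\mathbf S]$ can be written as

$$E_{\mathbf H}[\mathbf S] = \frac{1}{\sigma^2}\mathbf I_r - E_{\mathbf H}[\mathbf S]\sum_{l}\tilde\alpha_l\mathbf C^{(l)} + \sum_{l}E_{\mathbf H}\big[\mathring\eta_l\,\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\big].$$

And finally,

$$E_{\mathbf H}[\mathbf S] = \mathbf R + \boldsymbol\Upsilon, \tag{55}$$

where $\mathbf R = \big[\sigma^2\big(\mathbf I_r + \sum_{l}\tilde\alpha_l\mathbf C^{(l)}\big)\big]^{-1}$ and where the matrix $\boldsymbol\Upsilon$ is defined as

$$\boldsymbol\Upsilon = \sigma^2\sum_{l}E_{\mathbf H}\big[\mathring\eta_l\,\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\big]\mathbf R. \tag{56}$$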



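To end the proof of Proposition 1 we now need to prove that $\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A) = O\left(\frac{1}{t^2}\right)$ for any uniformly bounded matrix $\mathbf A$. Let $\mathbf A$ be a $r \times r$ matrix uniformly bounded in $r$, and write $\chi_l = \mathrm{Tr}\big(\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\mathbf R\mathbf A\big)$ for short. Using (56),

$$\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A) = \frac{\sigma^2}{t}\sum_{l}E_{\mathbf H}\big[\mathring\eta_l\,\chi_l\big] = \frac{\sigma^2}{t}\sum_{l}E_{\mathbf H}\big[\mathring\eta_l\,\mathring\chi_l\big].$$

We can now bound $\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A)$ thanks to the Cauchy-Schwarz inequality:

$$\left|\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A)\right| \le \frac{\sigma^2}{t}\sum_{l}E_{\mathbf H}\big|\mathring\eta_l\,\mathring\chi_l\big| \le \frac{\sigma^2}{t}\sum_{l}\sqrt{\mathrm{var}(\eta_l)}\sqrt{\mathrm{var}\big(\mathrm{Tr}(\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\mathbf R\mathbf A)\big)}, \tag{57}$$

as $E\big[|\mathring x|^2\big] = \mathrm{var}(x)$ for any random variable $x$. We first prove that $\mathrm{var}(\eta_l) = O\left(\frac{1}{t^2}\right)$. The Nash-Poincaré inequality (53) states that

$$\mathrm{var}(\eta_l) \le \frac1t\sum_{i,j,m,n,k} C^{(k)}_{im}\tilde C^{(k)}_{jn}\,E\left[\frac{\partial\eta_l}{\partial H^{(k)}_{ij}}\left(\frac{\partial\eta_l}{\partial H^{(k)}_{mn}}\right)^{*} + \left(\frac{\partial\eta_l}{\partial H^{(k)*}_{ij}}\right)^{*}\frac{\partial\eta_l}{\partial H^{(k)*}_{mn}}\right]. \tag{58}$$

As $\dfrac{\partial S_{pq}}{\partial H^{(k)}_{ij}} = -S_{pi}(\mathbf H^H\mathbf S)_{jq}$ we can derive

$$\frac{\partial\eta_l}{\partial H^{(k)}_{ij}} = \frac1t\mathrm{Tr}\Big(\frac{\partial\mathbf S}{\partial H^{(k)}_{ij}}\mathbf C^{(l)}\Big) = \frac1t\sum_{p,q}\frac{\partial S_{pq}}{\partial H^{(k)}_{ij}}C^{(l)}_{qp} = -\frac1t\big(\mathbf H^H\mathbf S\mathbf C^{(l)}\mathbf S\big)_{ji}.$$

Similarly we obtain $\dfrac{\partial\eta_l}{\partial H^{(k)*}_{ij}} = -\dfrac1t\big(\mathbf S\mathbf C^{(l)}\mathbf S\mathbf H\big)_{ij}$. Therefore (58) becomes

$$\mathrm{var}(\eta_l) \le \frac{1}{t^3}\sum_{i,j,m,n,k} C^{(k)}_{im}\tilde C^{(k)}_{jn}\,E\Big[(\mathbf H^H\mathbf S\mathbf C^{(l)}\mathbf S)_{ji}(\mathbf H^H\mathbf S\mathbf C^{(l)}\mathbf S)^{*}_{nm} + (\mathbf S\mathbf C^{(l)}\mathbf S\mathbf H)^{*}_{ij}(\mathbf S\mathbf C^{(l)}\mathbf S\mathbf H)_{mn}\Big]$$

$$= \frac{1}{t^3}\sum_{k}E\Big[\mathrm{Tr}\Big((\mathbf H^H\mathbf S\mathbf C^{(l)}\mathbf S)\mathbf C^{(k)}(\mathbf H^H\mathbf S\mathbf C^{(l)}\mathbf S)^H\tilde{\mathbf C}^{(k)T}\Big) + \mathrm{Tr}\Big(\tilde{\mathbf C}^{(k)T}(\mathbf S\mathbf C^{(l)}\mathbf S\mathbf H)^H\mathbf C^{(k)}(\mathbf S\mathbf C^{(l)}\mathbf S\mathbf H)\Big)\Big].$$

Then, using the inequality $|\mathrm{Tr}(\mathbf B_1\mathbf B_2)| \le \|\mathbf B_1\|\,\mathrm{Tr}\,\mathbf B_2$, where $\mathbf B_2$ is non-negative hermitian, for both traces in the above expression,

$$\mathrm{var}(\eta_l) \le \frac{2}{t^3}\|\mathbf C^{(l)}\|^2\sum_{k}\|\mathbf C^{(k)}\|\,E\Big[\|\mathbf S\|^4\,\mathrm{Tr}\big(\mathbf H\tilde{\mathbf C}^{(k)T}\mathbf H^H\big)\Big] \le \frac{2}{t^3}\|\mathbf C^{(l)}\|^2\sum_{k}\|\mathbf C^{(k)}\|\,\|\tilde{\mathbf C}^{(k)}\|\,E\Big[\|\mathbf S\|^4\,\mathrm{Tr}\big(\mathbf H\mathbf H^H\big)\Big] \le \frac{1}{t^2}\,\frac{2LC_{\sup}^4}{\sigma^8}\,E\Big[\frac1t\mathrm{Tr}\,\mathbf H\mathbf H^H\Big], \tag{59}$$

where the last inequality follows from $\|\mathbf S\| \le \frac{1}{\sigma^2}$ and from the definition of $C_{\sup}$:

$$C_{\sup} = \sup_t C_{\max} = \sup_t\Big(\max_{k,l}\big\{\|\mathbf C^{(k)}\|, \|\tilde{\mathbf C}^{(l)}\|\big\}\Big). \tag{60}$$

The hypotheses of Proposition 1 ensure that $C_{\sup} < +\infty$. We now prove that $E\big[\frac1t\mathrm{Tr}(\mathbf H\mathbf H^H)\big] = O(1)$. Using the fact that the channels $\mathbf H^{(l)}$ are independent and follow the Kronecker model, that is $E_{\mathbf H}\big[H^{(k)}_{ij}H^{(l)*}_{mn}\big] = \frac1t\,\delta_{k,l}\,C^{(l)}_{im}\tilde C^{(l)}_{jn}$,

$$E_{\mathbf H}\Big[\frac1t\mathrm{Tr}(\mathbf H\mathbf H^H)\Big] = \frac1t\sum_{i,j,k,l}E_{\mathbf H}\big[H^{(k)}_{ij}H^{(l)*}_{ij}\big] = \frac{1}{t^2}\sum_{i,j,l}C^{(l)}_{ii}\tilde C^{(l)}_{jj} = \frac{1}{t^2}\sum_{l}\mathrm{Tr}\,\mathbf C^{(l)}\,\mathrm{Tr}\,\tilde{\mathbf C}^{(l)} \le \frac rt\sum_{l}\|\mathbf C^{(l)}\|\,\|\tilde{\mathbf C}^{(l)}\| \le \frac rt LC_{\sup}^2.$$

Therefore we proved that $E_{\mathbf H}\big[\frac1t\mathrm{Tr}(\mathbf H\mathbf H^H)\big] = O(1)$. Coming back to (59) gives $\mathrm{var}(\eta_l) \le \frac{1}{t^2}\Big(\frac{2\frac rt L^2C_{\sup}^6}{\sigma^8}\Big)$, hence $\mathrm{var}(\eta_l) = O\left(\frac{1}{t^2}\right)$.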
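We evaluate similarly the behavior of the second term of the right-hand side of (57) and we obtain

$$\mathrm{var}\big(\mathrm{Tr}(\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\mathbf R\mathbf A)\big) \le \frac{k}{\sigma^{12}}\Big(1 + \frac{1}{\sigma^4}\Big)\|\mathbf A\|^2 = O(1),$$

where $k$ does not depend on $\sigma^2$ nor on $t$. As $\mathrm{var}(\eta_l) = O\left(\frac{1}{t^2}\right)$, we eventually have

$$\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A) = O\left(\frac{1}{t^2}\right),$$

which completes the proof of Proposition 1.

Remark 1: Note that, as $\mathrm{var}(\eta_l) \le \frac{1}{\sigma^8t^2}\big(2\frac rt C_{\sup}^6L^2\big)$ and $\mathrm{var}\big(\mathrm{Tr}(\mathbf S\mathbf H\tilde{\mathbf C}^{(l)T}\tilde{\mathbf R}^T\mathbf H^H\mathbf R\mathbf A)\big) \le \frac{1}{\sigma^{12}}\big(k\|\mathbf A\|^2\big(1 + \frac{1}{\sigma^4}\big)\big)$, (57) leads to $\left|\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A)\right| \le \frac{1}{\sigma^8t^2}P\left(\frac{1}{\sigma^2}\right)$, where $P$ is a polynomial with real positive coefficients which do not depend on $\sigma^2$ nor on $t$.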

Appendix C 

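A REFINED LARGE SYSTEM APPROXIMATION OF $E_{\mathbf H}[\mathrm{Tr}\,\mathbf S]$ - PROOF OF PROPOSITION 2

We prove in this section that $\frac1t\mathrm{Tr}(\mathbf R\mathbf A) = \frac1t\mathrm{Tr}(\mathbf T\mathbf A) + O\left(\frac{1}{t^2}\right)$ for any $r \times r$ matrix $\mathbf A$ uniformly bounded in $r$. We first note that the difference $\frac1t\mathrm{Tr}(\mathbf R\mathbf A) - \frac1t\mathrm{Tr}(\mathbf T\mathbf A)$ can be written as

$$\frac1t\mathrm{Tr}\big((\mathbf R - \mathbf T)\mathbf A\big) = \frac1t\mathrm{Tr}\big(\mathbf R(\mathbf T^{-1} - \mathbf R^{-1})\mathbf T\mathbf A\big) = -\frac{\sigma^2}{t}\sum_{l}(\tilde\alpha_l - \tilde\delta_l)\,\mathrm{Tr}\big(\mathbf R\mathbf C^{(l)}\mathbf T\mathbf A\big). \tag{61}$$

As $\|\mathbf T\| \le \frac{1}{\sigma^2}$ and $\|\mathbf R\| \le \frac{1}{\sigma^2}$, expression (61) yields

$$\left|\frac1t\mathrm{Tr}\big((\mathbf R - \mathbf T)\mathbf A\big)\right| \le \frac rt\,\frac{C_{\sup}\|\mathbf A\|}{\sigma^2}\sum_{l}|\tilde\alpha_l - \tilde\delta_l|, \tag{62}$$

where $C_{\sup} < +\infty$ is defined by (60). We now consider the difference $\frac1t\mathrm{Tr}(\tilde{\mathbf R}\mathbf A) - \frac1t\mathrm{Tr}(\tilde{\mathbf T}\mathbf A)$ for any $t \times t$ matrix $\mathbf A$ uniformly bounded in $t$, which can be bounded similarly:

$$\left|\frac1t\mathrm{Tr}\big((\tilde{\mathbf R} - \tilde{\mathbf T})\mathbf A\big)\right| \le \frac{C_{\sup}\|\mathbf A\|}{\sigma^2}\sum_{l}|\alpha_l - \delta_l|. \tag{63}$$

Taking $\mathbf A = \mathbf C^{(k)}$ in (62), $\mathbf A = \tilde{\mathbf C}^{(k)}$ in (63) and using Proposition 1 gives

$$|\alpha_k - \delta_k| \le \frac rt\,\frac{C_{\sup}^2}{\sigma^2}\sum_{l}|\tilde\alpha_l - \tilde\delta_l| + O\left(\frac{1}{t^2}\right), \tag{64}$$

$$|\tilde\alpha_k - \tilde\delta_k| \le \frac{C_{\sup}^2}{\sigma^2}\sum_{l}|\alpha_l - \delta_l|, \tag{65}$$

which leads to

$$\Big(1 - \frac rt\,\frac{C_{\sup}^4L^2}{\sigma^4}\Big)\sum_{k}|\alpha_k - \delta_k| \le O\left(\frac{1}{t^2}\right).$$

Therefore it is clear that there exists $\sigma_0^2$ such that $|\alpha_k - \delta_k| = O\left(\frac{1}{t^2}\right)$ for $\sigma^2 > \sigma_0^2$ for any $k \in \{1, \dots, L\}$. In particular, $|\alpha_k - \delta_k| \to 0$ when $t \to \infty$ for $\sigma^2 > \sigma_0^2$. We now extend this result to any $\sigma^2 > 0$. To this end, similarly to Appendix A, it is useful to consider $\alpha_l$ and $\delta_l$ as functions of the parameter $(-\sigma^2) \in \mathbb R^-$ and to extend their domain of validity from $\mathbb R^-$ to $\mathbb C - \mathbb R^+$ in order to use the results about Stieltjes transforms. The function $\delta_l(z)$ then corresponds to the function $\psi_l(z)$ of Appendix A and therefore belongs to $\mathcal S(\mathbb R^+)$ with an associated measure of mass $\frac1t\mathrm{Tr}\,\mathbf C^{(l)}$, for $l = 1, \dots, L$. It is easy to check that function $\alpha_l(z)$ also belongs to $\mathcal S(\mathbb R^+)$ with an associated measure of mass $\frac1t\mathrm{Tr}\,\mathbf C^{(l)}$ for any $l \in \{1, \dots, L\}$. Hence, by Proposition 9 (v), we can upper bound the Stieltjes transforms $\alpha_l(z)$ and $\delta_l(z)$ on $\mathbb C - \mathbb R^+$, yielding:

$$|\alpha_l(z) - \delta_l(z)| \le 2\,\frac{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}{d(z,\mathbb R^+)} \le 2\,\frac{\frac rt C_{\sup}}{d(z,\mathbb R^+)}.$$

The $(\alpha_l(z) - \delta_l(z))_{t\in\mathbb N}$ are thus bounded on any compact set included in $\mathbb C - \mathbb R^+$, uniformly in $t$. Moreover $(\alpha_l(z) - \delta_l(z))_{t\in\mathbb N}$ is a family of analytic functions. Using Montel's theorem similarly to Appendix A, we obtain that $|\alpha_l(z) - \delta_l(z)| \to 0$ when $t \to \infty$ on $\mathbb C - \mathbb R^+$ for any $l \in \{1, \dots, L\}$, thus in particular

$$|\alpha_l - \delta_l| \xrightarrow[t\to\infty]{} 0 \tag{66}$$

for any $\sigma^2 > 0$, $l \in \{1, \dots, L\}$, which, used in (65), yields

$$|\tilde\alpha_l - \tilde\delta_l| \xrightarrow[t\to\infty]{} 0 \tag{67}$$

for any $\sigma^2 > 0$, $l \in \{1, \dots, L\}$. Using (67) in (62) and (66) in (63) gives

$$\frac1t\mathrm{Tr}\big(\mathbf A(\mathbf R - \mathbf T)\big) \xrightarrow[t\to\infty]{} 0, \tag{68}$$

$$\frac1t\mathrm{Tr}\big(\mathbf A(\tilde{\mathbf R} - \tilde{\mathbf T})\big) \xrightarrow[t\to\infty]{} 0. \tag{69}$$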



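We now refine (68) and (69) to prove that these two traces are $O\left(\frac{1}{t^2}\right)$. Taking $\mathbf A = \mathbf C^{(k)}$ in (61) leads to $\alpha_k - \delta_k = -\frac{\sigma^2}{t}\sum_{l}(\tilde\alpha_l - \tilde\delta_l)\mathrm{Tr}\big(\mathbf C^{(k)}\mathbf T\mathbf C^{(l)}\mathbf R\big) + \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\boldsymbol\Upsilon\big)$, where $\boldsymbol\Upsilon = E_{\mathbf H}[\mathbf S] - \mathbf R$, and similarly $\tilde\alpha_k - \tilde\delta_k = -\frac{\sigma^2}{t}\sum_{l}(\alpha_l - \delta_l)\mathrm{Tr}\big(\tilde{\mathbf C}^{(k)}\tilde{\mathbf T}\tilde{\mathbf C}^{(l)}\tilde{\mathbf R}\big)$. We can rewrite these two equalities under the following matrix form:

$$\big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)\begin{bmatrix}\boldsymbol\alpha - \boldsymbol\delta\\ \tilde{\boldsymbol\delta} - \tilde{\boldsymbol\alpha}\end{bmatrix} = \begin{bmatrix}\boldsymbol\varepsilon\\ \mathbf 0\end{bmatrix}, \tag{70}$$

where $\boldsymbol\varepsilon$ is a $L \times 1$ vector whose entries, defined by $\varepsilon_k = \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\boldsymbol\Upsilon\big)$, verify $\varepsilon_k = O\left(\frac{1}{t^2}\right)$, $k = 1, \dots, L$, by Proposition 1, and where matrix $\mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})$ is defined by

$$\mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T}) = \sigma^2\begin{bmatrix}\mathbf 0 & \mathbf B(\mathbf R, \mathbf T)\\ \tilde{\mathbf B}(\tilde{\mathbf R}, \tilde{\mathbf T}) & \mathbf 0\end{bmatrix}, \tag{71}$$

where matrices $\mathbf B(\mathbf R, \mathbf T)$ and $\tilde{\mathbf B}(\tilde{\mathbf R}, \tilde{\mathbf T})$ are $L \times L$ matrices whose entries are defined by $B_{kl}(\mathbf R, \mathbf T) = \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\mathbf T\mathbf C^{(l)}\mathbf R\big)$ and $\tilde B_{kl}(\tilde{\mathbf R}, \tilde{\mathbf T}) = \frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(k)}\tilde{\mathbf T}\tilde{\mathbf C}^{(l)}\tilde{\mathbf R}\big)$. Besides, taking $\mathbf A = \mathbf C^{(l)}\mathbf T\mathbf C^{(k)}$ in (68) and $\mathbf A = \tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf C}^{(k)}$ in (69) leads to

$$\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf C^{(k)}\mathbf R\big) - \frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf C^{(k)}\mathbf T\big) \xrightarrow[t\to\infty]{} 0, \qquad \frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf C}^{(k)}\tilde{\mathbf R}\big) - \frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf C}^{(k)}\tilde{\mathbf T}\big) \xrightarrow[t\to\infty]{} 0. \tag{72}$$

Hence $B_{kl}(\mathbf R, \mathbf T) - A_{kl}(\mathbf T) \to 0$ and $\tilde B_{kl}(\tilde{\mathbf R}, \tilde{\mathbf T}) - \tilde A_{kl}(\tilde{\mathbf T}) \to 0$ when $t \to \infty$, where matrices $\mathbf A(\mathbf T)$ and $\tilde{\mathbf A}(\tilde{\mathbf T})$ are defined by (18). We now introduce the following lemma: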

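Lemma 1: Let $\mathbf T$, $\tilde{\mathbf T}$ be the matrices defined by (11) with $(\boldsymbol\delta, \tilde{\boldsymbol\delta})$ verifying the canonical equation (9) with $\mathbf Q = \mathbf I_t$. Let $\mathbf A(\mathbf T)$ and $\tilde{\mathbf A}(\tilde{\mathbf T})$ be the $L \times L$ matrices whose entries are defined by $A_{kl}(\mathbf T) = \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\mathbf T\mathbf C^{(l)}\mathbf T\big)$ and $\tilde A_{kl}(\tilde{\mathbf T}) = \frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(k)}\tilde{\mathbf T}\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\big)$, and $\mathbf M(\mathbf T, \tilde{\mathbf T})$ the matrix defined by

$$\mathbf M(\mathbf T, \tilde{\mathbf T}) = \sigma^2\begin{bmatrix}\mathbf 0 & \mathbf A(\mathbf T)\\ \tilde{\mathbf A}(\tilde{\mathbf T}) & \mathbf 0\end{bmatrix}.$$

Assume that, for every $l \in \{1, \dots, L\}$, $\sup_t\|\mathbf C^{(l)}\| < +\infty$, $\sup_t\|\tilde{\mathbf C}^{(l)}\| < +\infty$, $\inf_t\big(\frac1t\mathrm{Tr}\,\mathbf C^{(l)}\big) > 0$ and $\inf_t\big(\frac1t\mathrm{Tr}\,\tilde{\mathbf C}^{(l)}\big) > 0$. Then there exist $k_0 > 0$ and $k_1 < \infty$, both independent of $\sigma^2$, such that

(i) $\sup_t\rho\big(\mathbf M(\mathbf T, \tilde{\mathbf T})\big) \le 1 - \dfrac{k_0\sigma^2}{(\sigma^2 + k_1)^2} < 1$,

(ii) $\sup_t\rho\big(\sigma^4\tilde{\mathbf A}(\tilde{\mathbf T})\mathbf A(\mathbf T)\big) \le 1 - \dfrac{k_0\sigma^2}{(\sigma^2 + k_1)^2} < 1$,

(iii) $\sup_t\left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty \le \dfrac{(\sigma^2 + k_1)^2}{k_0\sigma^2}$,

where $|||\cdot|||_\infty$ is the max-row $\ell_1$ norm defined by $|||\mathbf P|||_\infty = \max_{j\in\{1,\dots,M\}}\sum_{k=1}^{N}|P_{jk}|$ for a $M \times N$ matrix $\mathbf P$.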
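
The matrix $\mathbf M(\mathbf T, \tilde{\mathbf T})$ of Lemma 1 is easy to form numerically, which gives a way to observe statement (i) on small examples. The sketch below is illustrative only: the function names are hypothetical, $\mathbf T$ and $\tilde{\mathbf T}$ are assumed to be $[\sigma^2(\mathbf I_r + \sum_j\tilde\delta_j\mathbf C^{(j)})]^{-1}$ and $[\sigma^2(\mathbf I_t + \sum_j\delta_j\tilde{\mathbf C}^{(j)})]^{-1}$ (the case $\mathbf Q = \mathbf I_t$), and $(\boldsymbol\delta, \tilde{\boldsymbol\delta})$ is assumed to have been computed beforehand as the solution of (9).

```python
import numpy as np

def lemma1_matrix(C, Ct, delta, delta_t, sigma2):
    """Build M(T, T_tilde) = sigma^2 [[0, A(T)], [A_tilde(T_tilde), 0]]."""
    L, r, t = len(C), C[0].shape[0], Ct[0].shape[0]
    T = np.linalg.inv(sigma2 * (np.eye(r) + sum(d * c for d, c in zip(delta_t, C))))
    Tt = np.linalg.inv(sigma2 * (np.eye(t) + sum(d * c for d, c in zip(delta, Ct))))
    # A_kl = (1/t) Tr(C^(k) T C^(l) T), and similarly for the transmit side
    A = np.array([[np.trace(C[k] @ T @ C[l] @ T).real / t for l in range(L)]
                  for k in range(L)])
    At = np.array([[np.trace(Ct[k] @ Tt @ Ct[l] @ Tt).real / t for l in range(L)]
                   for k in range(L)])
    Z = np.zeros((L, L))
    return sigma2 * np.block([[Z, A], [At, Z]])

def spectral_radius(M):
    """Largest modulus of the eigenvalues of M."""
    return np.abs(np.linalg.eigvals(M)).max()

# Toy case r = t = 2, L = 1, identity correlations, sigma^2 = 1: the solution
# of (9) is delta = delta_tilde = (sqrt(5) - 1) / 2.
g = (np.sqrt(5) - 1) / 2
M = lemma1_matrix([np.eye(2)], [np.eye(2)], [g], [g], 1.0)
print(spectral_radius(M))  # strictly below 1, as statement (i) asserts
```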



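Proof: Using the expression of $\mathbf T^{-1} = \sigma^2\big(\mathbf I_r + \sum_{k}\tilde\delta_k\mathbf C^{(k)}\big)$, $\delta_l$ can be written as:

$$\delta_l = \frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T^{-1}\mathbf T\big) = \frac{\sigma^2}{t}\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T\big) + \frac{\sigma^2}{t}\sum_{k=1}^{L}\tilde\delta_k\,\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf C^{(k)}\mathbf T\big).$$

Similarly it holds that $\tilde\delta_l = \frac{\sigma^2}{t}\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf T}\big) + \frac{\sigma^2}{t}\sum_{k=1}^{L}\delta_k\,\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf C}^{(k)}\tilde{\mathbf T}\big)$. Thus,

$$\begin{bmatrix}\boldsymbol\delta\\ \tilde{\boldsymbol\delta}\end{bmatrix} = \sigma^2\begin{bmatrix}\mathbf 0 & \mathbf A(\mathbf T)\\ \tilde{\mathbf A}(\tilde{\mathbf T}) & \mathbf 0\end{bmatrix}\begin{bmatrix}\boldsymbol\delta\\ \tilde{\boldsymbol\delta}\end{bmatrix} + \begin{bmatrix}\mathbf w\\ \tilde{\mathbf w}\end{bmatrix},$$

where $\mathbf w$ and $\tilde{\mathbf w}$ are $L \times 1$ vectors such that $w_l = \frac{\sigma^2}{t}\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T\big)$ and $\tilde w_l = \frac{\sigma^2}{t}\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf T}\big)$. This equality is of the form $\mathbf u = \mathbf M(\mathbf T, \tilde{\mathbf T})\mathbf u + \mathbf v$, with $\mathbf u = [\boldsymbol\delta^T, \tilde{\boldsymbol\delta}^T]^T$ and $\mathbf v = [\mathbf w^T, \tilde{\mathbf w}^T]^T$, the entries of $\mathbf u$ and $\mathbf v$ being positive, and the entries of $\mathbf M(\mathbf T, \tilde{\mathbf T})$ non-negative. A direct application of Corollary 8.1.29 of [11] then implies $\rho\big(\mathbf M(\mathbf T, \tilde{\mathbf T})\big) \le 1 - \dfrac{\min_l v_l}{\max_l u_l}$.

We first briefly consider $\sup_t\{\max_l u_l\}$. As $\|\mathbf T\| \le \frac{1}{\sigma^2}$ and $\|\mathbf C^{(l)}\| \le C_{\sup}$ we have

$$\delta_l = \frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\big) \le \frac rt\,\frac{1}{\sigma^2}\,C_{\sup}. \tag{73}$$

Similarly, as $\|\tilde{\mathbf T}\| \le \frac{1}{\sigma^2}$ and $\|\tilde{\mathbf C}^{(l)}\| \le C_{\sup}$,

$$\tilde\delta_l = \frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\big) \le \frac{1}{\sigma^2}\,C_{\sup}. \tag{74}$$

As $t/r \to c > 0$ when $t \to \infty$, we have that $\sup_t[r/t] < +\infty$. Therefore $\sup_t\{\max_l u_l\} \le \frac{\lambda_0}{\sigma^2} < \infty$, where $\lambda_0 = C_{\sup}\max\{1, \sup_t[r/t]\}$.

We now consider $\inf_t\{\min_l v_l\} = \inf_t\big[\min_l\big\{\frac{\sigma^2}{t}\mathrm{Tr}(\mathbf C^{(l)}\mathbf T\mathbf T), \frac{\sigma^2}{t}\mathrm{Tr}(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf T})\big\}\big]$. We will use the Cauchy-Schwarz inequality:

$$|\mathrm{Tr}(\mathbf A\mathbf B)| \le \sqrt{\mathrm{Tr}(\mathbf A\mathbf A^H)}\,\sqrt{\mathrm{Tr}(\mathbf B\mathbf B^H)}. \tag{75}$$

Taking $\mathbf A = (\mathbf C^{(l)})^{1/2}\mathbf T$ and $\mathbf B = (\mathbf C^{(l)})^{1/2}$ in (75) leads to

$$\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T\big) \ge \frac{\big(\frac1t\mathrm{Tr}(\mathbf C^{(l)}\mathbf T)\big)^2}{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}} = \frac{\delta_l^2}{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}. \tag{76}$$

We use again inequality (75), this time with $\mathbf A = (\mathbf C^{(l)})^{1/2}\mathbf T^{1/2}$ and $\mathbf B = \mathbf T^{-1/2}(\mathbf C^{(l)})^{1/2}$. Then,

$$\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\big) \ge \frac{\big(\frac1t\mathrm{Tr}\,\mathbf C^{(l)}\big)^2}{\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T^{-1}\big)}. \tag{77}$$

Thanks to (74), $\|\mathbf T^{-1}\| = \big\|\sigma^2\big(\mathbf I_r + \sum_{k}\tilde\delta_k\mathbf C^{(k)}\big)\big\| \le \sigma^2 + LC_{\sup}^2$. Hence (77) leads to

$$\delta_l \ge \frac{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}{\|\mathbf T^{-1}\|} \ge \frac{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}{\sigma^2 + LC_{\sup}^2}. \tag{78}$$

Eventually, using (78) in (76) gives

$$\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T\big) \ge \frac{\frac1t\mathrm{Tr}\,\mathbf C^{(l)}}{(\sigma^2 + LC_{\sup}^2)^2}. \tag{79}$$

Similarly, we prove that

$$\frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf T}\big) \ge \frac{\frac1t\mathrm{Tr}\,\tilde{\mathbf C}^{(l)}}{\big(\sigma^2 + \frac rt LC_{\sup}^2\big)^2}.$$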

Therefore inf t {mini uj} > ^ fc , 2 , where Ai = min, |inf t [±TrC (z) ] ,inf t ±TrC W | > and ki = 

LC 2 max {l,inf t [r/t]} = LC sup Ao < +00. Noting k$ = y 1 > we can now conclude about statement 

© of the lemma: 

supp(M(T,T)) < 1 < 1 - 

i sup t (maxzui) (cH + fci)^ 

As for statement © of the lemma, we note that |M(T, T) - AI 2 l| = ct 4 A(T)A(T) - A 2 I L . Hence 
p(a 4 A(t)A(T)) = (p(M(T,t))) 2 < (l - j^,) 2 < 1. 

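Concerning statement (iii), the proof is the same as in [13, Lemma 5.2]. Nonetheless we provide it here for the sake of completeness. As $\rho(\mathbf M(\mathbf T, \tilde{\mathbf T})) < 1$, the series $\sum_{k\in\mathbb N}\mathbf M(\mathbf T, \tilde{\mathbf T})^k$ converges, matrix $\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})$ is invertible and its inverse can be written as $\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1} = \sum_{k\in\mathbb N}\mathbf M(\mathbf T, \tilde{\mathbf T})^k$. Therefore the entries of $\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}$ are non-negative and

$$u_k = \sum_{l=1}^{2L}\Big[\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}\Big]_{kl}v_l \ge \min_l(v_l)\sum_{l=1}^{2L}\Big[\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}\Big]_{kl}.$$

Hence $\max_k\sum_{l=1}^{2L}\Big[\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}\Big]_{kl} \le \dfrac{\max_k(u_k)}{\min_l(v_l)}$. As the entries of $\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}$ are non-negative, it eventually follows that:

$$\sup_t\left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf M(\mathbf T, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty \le \frac{\sup_t(\max_l u_l)}{\inf_t(\min_l v_l)} \le \frac{(\sigma^2 + k_1)^2}{k_0\sigma^2}.$$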



Remark 2: Lemma [Q (O) is used in the proof of Theorem Q] for the uniqueness of solutions to ©, but 
we took care not to use any consequences of this uniqueness in the proof above; this proof only requires 
the existence of solutions to I©. 

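Remark 3: Unfortunately the assumptions $\inf_t\big(\frac1t\mathrm{Tr}\,\mathbf C^{(l)}\big) > 0$ and $\inf_t\big(\frac1t\mathrm{Tr}\,\tilde{\mathbf C}^{(l)}\big) > 0$ made in Lemma 1 cannot be relaxed, as $\frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\mathbf T\mathbf T\big) \le \frac{1}{\sigma^4}\big(\frac1t\mathrm{Tr}\,\mathbf C^{(l)}\big)$ and similarly $\frac1t\mathrm{Tr}\big(\tilde{\mathbf C}^{(l)}\tilde{\mathbf T}\tilde{\mathbf T}\big) \le \frac{1}{\sigma^4}\big(\frac1t\mathrm{Tr}\,\tilde{\mathbf C}^{(l)}\big)$.

The entries of $\mathbf B(\mathbf R, \mathbf T)$ and $\tilde{\mathbf B}(\tilde{\mathbf R}, \tilde{\mathbf T})$ respectively converge to the entries of $\mathbf A(\mathbf T)$ and $\tilde{\mathbf A}(\tilde{\mathbf T})$, hence there exists $t_0$ such that, for $t \ge t_0$,

• the matrix $\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})$ is invertible,

• $\sup_{t\ge t_0}\left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty \le \dfrac{2(\sigma^2 + k_1)^2}{k_0\sigma^2}$.

Then, for $t \ge t_0$, (70) yields

$$\begin{bmatrix}\boldsymbol\alpha - \boldsymbol\delta\\ \tilde{\boldsymbol\delta} - \tilde{\boldsymbol\alpha}\end{bmatrix} = \big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)^{-1}\begin{bmatrix}\boldsymbol\varepsilon\\ \mathbf 0\end{bmatrix}. \tag{80}$$

Hence $\max_l\{|\alpha_l - \delta_l|, |\tilde\alpha_l - \tilde\delta_l|\} \le \left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty\max_k|\varepsilon_k|$, and as $\varepsilon_l = \frac1t\mathrm{Tr}\big(\mathbf C^{(l)}\boldsymbol\Upsilon\big) = O\left(\frac{1}{t^2}\right)$ for $l = 1, \dots, L$, we eventually have that

$$\tilde\alpha_l - \tilde\delta_l = O\left(\frac{1}{t^2}\right). \tag{81}$$

Using (81) in (62) completes the proof of Proposition 2.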



Appendix D 

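INTEGRABILITY OF $E_{\mathbf H}[\mathrm{Tr}(\mathbf T - \mathbf S)]$ - PROOF OF PROPOSITION 3

We first consider $E_{\mathbf H}[\mathrm{Tr}(\mathbf R - \mathbf S)]$, which is equal to $-\mathrm{Tr}\,\boldsymbol\Upsilon$ by (55). As noted in Remark 1 of Appendix B, we have $\left|\frac1t\mathrm{Tr}(\boldsymbol\Upsilon\mathbf A)\right| \le \frac{1}{\sigma^8t^2}P_0\left(\frac{1}{\sigma^2}\right)$, where $P_0$ is a polynomial with real positive coefficients which do not depend on $\sigma^2$ nor on $t$. Therefore

$$\left|E_{\mathbf H}[\mathrm{Tr}(\mathbf R - \mathbf S)]\right| \le \frac{1}{\sigma^8t}P_0\left(\frac{1}{\sigma^2}\right). \tag{82}$$

We now consider $\mathrm{Tr}(\mathbf R - \mathbf T)$. Similarly to Appendix C, there exists $t_0$ such that $\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})$ is invertible and such that $\left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty \le \dfrac{2(\sigma^2 + k_1)^2}{k_0\sigma^2}$ for $t \ge t_0$, where $k_0$ and $k_1$ are given by Lemma 1. Then (70) implies

$$|\tilde\alpha_l - \tilde\delta_l| \le \left|\!\left|\!\left|\big(\mathbf I_{2L} - \mathbf N(\mathbf R, \mathbf T, \tilde{\mathbf R}, \tilde{\mathbf T})\big)^{-1}\right|\!\right|\!\right|_\infty\max_k|\varepsilon_k| \le \frac{2(\sigma^2 + k_1)^2}{k_0\sigma^2}\max_k|\varepsilon_k|,$$

where $\varepsilon_k = \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\boldsymbol\Upsilon\big)$. Besides, Remark 1 of Appendix B ensures that $|\varepsilon_k| \le \frac{1}{\sigma^8t^2}P_1\left(\frac{1}{\sigma^2}\right)$, where $P_1$ is a polynomial with real positive coefficients which do not depend on $\sigma^2$ nor on $t$. Hence,

$$|\tilde\alpha_l - \tilde\delta_l| \le \frac{P_1\left(\frac{1}{\sigma^2}\right)}{\sigma^8t^2}\,\frac{2(\sigma^2 + k_1)^2}{k_0\sigma^2} \quad\text{for } t \ge t_0,\ l = 1, \dots, L. \tag{83}$$

Using (83) in (62) with $\mathbf A = \mathbf I_r$ then gives:

$$|\mathrm{Tr}(\mathbf R - \mathbf T)| \le \frac{1}{\sigma^8t}\,k_2\Big(1 + \frac{k_1}{\sigma^2}\Big)^2P_1\left(\frac{1}{\sigma^2}\right), \tag{84}$$

where $k_2 = \dfrac{2LC_{\sup}}{k_0}\sup_t\{r/t\} < +\infty$.

Eventually, (82) and (84) yield $\left|E_{\mathbf H}[\mathrm{Tr}(\mathbf T - \mathbf S)]\right| \le \frac{1}{\sigma^8t}P\left(\frac{1}{\sigma^2}\right)$ for $t \ge t_0$, where the coefficients of the polynomial $P\left(\frac{1}{\sigma^2}\right) = P_0\left(\frac{1}{\sigma^2}\right) + k_2\big(1 + \frac{k_1}{\sigma^2}\big)^2P_1\left(\frac{1}{\sigma^2}\right)$ are real positive and do not depend on $\sigma^2$ nor on $t$. This completes the proof of Proposition 3.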



Appendix E 
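Differentiability of $\mathbf Q \mapsto \boldsymbol\delta(\mathbf Q)$, $\mathbf Q \mapsto \tilde{\boldsymbol\delta}(\mathbf Q)$ and $\mathbf Q \mapsto \bar I(\mathbf Q)$ - Proof of Proposition 5

We prove in this section that, for $\mathbf Q, \mathbf P \in \mathcal C_1$, functions $\boldsymbol\delta$ and $\tilde{\boldsymbol\delta}$ are Gateaux differentiable at point $\mathbf Q$ in the direction $\mathbf P - \mathbf Q$, where $\boldsymbol\delta$, $\tilde{\boldsymbol\delta}$ are defined as the solutions of system (9). The proof is based on the implicit function theorem.

Let $\mathbf P, \mathbf Q \in \mathcal C_1$. We introduce the function $\Gamma : \mathbb R_+^L \times \mathbb R_+^L \times [0, 1] \to \mathbb R^{2L}$ defined by

$$\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, \lambda) = \begin{bmatrix}\boldsymbol\delta - f(\tilde{\boldsymbol\delta})\\ \tilde{\boldsymbol\delta} - \tilde f(\boldsymbol\delta, \mathbf Q + \lambda(\mathbf P - \mathbf Q))\end{bmatrix},$$

with $f(\tilde{\boldsymbol\delta}) = \big[f_1(\tilde{\boldsymbol\delta}), \dots, f_L(\tilde{\boldsymbol\delta})\big]^T$ and $\tilde f(\boldsymbol\delta, \mathbf Q) = \big[\tilde f_1(\boldsymbol\delta, \mathbf Q), \dots, \tilde f_L(\boldsymbol\delta, \mathbf Q)\big]^T$, where the $f_l$ and the $\tilde f_l$ are defined by (9). Note that $\boldsymbol\delta(\mathbf Q + \lambda(\mathbf P - \mathbf Q))$ and $\tilde{\boldsymbol\delta}(\mathbf Q + \lambda(\mathbf P - \mathbf Q))$ are defined by $\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, \lambda) = 0$. We want to apply the implicit function theorem on a neighbourhood of $\lambda = 0$; this requires the differentiability of $\Gamma$ on this neighbourhood, and the invertibility of the partial Jacobian $D_{(\boldsymbol\delta, \tilde{\boldsymbol\delta})}\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, \lambda)$ at point $\lambda = 0$.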



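We first note that $f_l : \tilde{\boldsymbol\delta} \mapsto \frac{1}{\sigma^2t}\mathrm{Tr}\Big[\mathbf C^{(l)}\big(\mathbf I + \sum_{k}\tilde\delta_k\mathbf C^{(k)}\big)^{-1}\Big]$ is clearly continuously differentiable on $\mathbb R_+^L$. Concerning $\tilde f_l$, we first need to use the matrix equality $(\mathbf I + \mathbf A\mathbf B)^{-1}\mathbf A = \mathbf A(\mathbf I + \mathbf B\mathbf A)^{-1}$, with $\mathbf A = \mathbf Q^{1/2}$ and $\mathbf B = \tilde{\mathbf C}(\boldsymbol\delta)\mathbf Q^{1/2}$:

$$\tilde f_l(\boldsymbol\delta, \mathbf Q) = \frac{1}{\sigma^2t}\mathrm{Tr}\Big[\mathbf Q^{1/2}\tilde{\mathbf C}^{(l)}\mathbf Q^{1/2}\big(\mathbf I + \mathbf Q^{1/2}\tilde{\mathbf C}(\boldsymbol\delta)\mathbf Q^{1/2}\big)^{-1}\Big] = \frac{1}{\sigma^2t}\mathrm{Tr}\Big[\tilde{\mathbf C}^{(l)}\mathbf Q\big(\mathbf I + \tilde{\mathbf C}(\boldsymbol\delta)\mathbf Q\big)^{-1}\Big]. \tag{85}$$

Recall that $\tilde{\mathbf C}(\boldsymbol\delta) = \sum_{k}\delta_k\tilde{\mathbf C}^{(k)}$. Function $(\boldsymbol\delta, \lambda) \mapsto \tilde f(\boldsymbol\delta, \mathbf Q + \lambda(\mathbf P - \mathbf Q))$ is therefore clearly continuously differentiable on $\mathbb R_+^L \times [0, 1]$. Nevertheless, as we want to use the implicit function theorem for $\lambda = 0$, we need to extend the continuous differentiability to an open set including $\lambda = 0$. Note that for $\lambda < 0$, $\mathbf Q + \lambda(\mathbf P - \mathbf Q)$ might have negative eigenvalues. Yet, $\det\big[\mathbf I + \tilde{\mathbf C}(\boldsymbol\delta)(\mathbf Q + \lambda(\mathbf P - \mathbf Q))\big] > 0$ for $\boldsymbol\delta = \boldsymbol\delta(\mathbf Q)$ and $\lambda = 0$. Therefore there exists a neighbourhood $V$ of $(\boldsymbol\delta(\mathbf Q), 0)$ on which $\det\big[\mathbf I + \tilde{\mathbf C}(\boldsymbol\delta)(\mathbf Q + \lambda(\mathbf P - \mathbf Q))\big] > 0$. Defining $\tilde f_l$ by (85), the functions $(\boldsymbol\delta, \lambda) \mapsto \tilde f_l(\boldsymbol\delta, \mathbf Q + \lambda(\mathbf P - \mathbf Q))$ are continuously differentiable on $V$. Hence, $\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, \lambda)$ is continuously differentiable on $\mathbb R_+^L \times V$.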
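
The two trace expressions in (85) can be compared numerically. The sketch below is an illustration only: the function names are hypothetical, $\mathbf Q$ is drawn at random in the set $\{\mathbf Q \ge 0,\ \frac1t\mathrm{Tr}\,\mathbf Q = 1\}$, and the matrices standing for $\tilde{\mathbf C}^{(l)}$ and $\tilde{\mathbf C}(\boldsymbol\delta)$ are arbitrary positive semi-definite matrices. It checks the push-through identity $(\mathbf I + \mathbf A\mathbf B)^{-1}\mathbf A = \mathbf A(\mathbf I + \mathbf B\mathbf A)^{-1}$ with $\mathbf A = \mathbf Q^{1/2}$ and $\mathbf B = \tilde{\mathbf C}(\boldsymbol\delta)\mathbf Q^{1/2}$.

```python
import numpy as np

def random_psd(n, rng):
    """Random complex positive semi-definite n x n matrix."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T

def f_tilde_sqrt_form(Cl, C_delta, Q, sigma2):
    """First expression in (85), written with Q^{1/2}."""
    t = Q.shape[0]
    w, U = np.linalg.eigh(Q)
    Qh = (U * np.sqrt(np.clip(w, 0.0, None))) @ U.conj().T
    inner = np.linalg.inv(np.eye(t) + Qh @ C_delta @ Qh)
    return np.trace(Qh @ Cl @ Qh @ inner) / (sigma2 * t)

def f_tilde_product_form(Cl, C_delta, Q, sigma2):
    """Second expression in (85), which no longer involves Q^{1/2}."""
    t = Q.shape[0]
    return np.trace(Cl @ Q @ np.linalg.inv(np.eye(t) + C_delta @ Q)) / (sigma2 * t)

rng = np.random.default_rng(0)
t = 4
Q = random_psd(t, rng)
Q *= t / np.trace(Q).real                      # normalise so that (1/t) Tr Q = 1
Cl, C_delta = random_psd(t, rng), random_psd(t, rng)
print(np.isclose(f_tilde_sqrt_form(Cl, C_delta, Q, 0.5),
                 f_tilde_product_form(Cl, C_delta, Q, 0.5)))
```

The second form is the one that remains well defined and smooth when $\mathbf Q + \lambda(\mathbf P - \mathbf Q)$ is no longer positive semi-definite, which is what the argument above requires for small negative $\lambda$.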

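We still have to check that the partial Jacobian $D_{(\boldsymbol\delta, \tilde{\boldsymbol\delta})}\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, \lambda)$ is invertible at the point $\lambda = 0$. As $\frac{\partial f_l}{\partial\tilde\delta_k} = -\sigma^2A_{lk}(\mathbf T)$ and $\frac{\partial\tilde f_l}{\partial\delta_k} = -\sigma^2\tilde A_{lk}(\tilde{\mathbf T})$,

$$D_{(\boldsymbol\delta, \tilde{\boldsymbol\delta})}\Gamma(\boldsymbol\delta, \tilde{\boldsymbol\delta}, 0) = \begin{bmatrix}\mathbf I_L & -D_{\tilde{\boldsymbol\delta}}f(\tilde{\boldsymbol\delta})\\ -D_{\boldsymbol\delta}\tilde f(\boldsymbol\delta, \mathbf Q) & \mathbf I_L\end{bmatrix} = \begin{bmatrix}\mathbf I_L & \sigma^2\mathbf A(\mathbf T)\\ \sigma^2\tilde{\mathbf A}(\tilde{\mathbf T}) & \mathbf I_L\end{bmatrix} = \mathbf I_{2L} + \mathbf M(\mathbf T, \tilde{\mathbf T}),$$

where $A_{kl}(\mathbf T) = \frac1t\mathrm{Tr}\big(\mathbf C^{(k)}\mathbf T\mathbf C^{(l)}\mathbf T\big)$ and $\tilde A_{kl}(\tilde{\mathbf T}) = \frac1t\mathrm{Tr}\big(\mathbf Q^{1/2}\tilde{\mathbf C}^{(k)}\mathbf Q^{1/2}\tilde{\mathbf T}\,\mathbf Q^{1/2}\tilde{\mathbf C}^{(l)}\mathbf Q^{1/2}\tilde{\mathbf T}\big)$, and where $\mathbf T = \mathbf T(\tilde{\boldsymbol\delta}(\mathbf Q))$ and $\tilde{\mathbf T} = \tilde{\mathbf T}(\boldsymbol\delta(\mathbf Q))$ are defined by (11). Matrices $\mathbf A(\mathbf T)$, $\tilde{\mathbf A}(\tilde{\mathbf T})$ and $\mathbf M(\mathbf T, \tilde{\mathbf T})$ correspond to those defined in Lemma 1 but in which $\tilde{\mathbf C}^{(l)}$ is replaced by $\mathbf Q^{1/2}\tilde{\mathbf C}^{(l)}\mathbf Q^{1/2}$. By Lemma 1 (i), $\rho(\mathbf M(\mathbf T, \tilde{\mathbf T})) < 1$, so that $-1$ is not an eigenvalue of $\mathbf M(\mathbf T, \tilde{\mathbf T})$; this gives the invertibility of $D_{(\boldsymbol\delta, \tilde{\boldsymbol\delta})}\Gamma$ at point $\lambda = 0$.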

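We are now in a position to apply the implicit function theorem, which asserts that functions $\lambda \mapsto \boldsymbol\delta(\mathbf Q + \lambda(\mathbf P - \mathbf Q))$ and $\lambda \mapsto \tilde{\boldsymbol\delta}(\mathbf Q + \lambda(\mathbf P - \mathbf Q))$ are continuously differentiable on a neighbourhood of 0. Hence, $\boldsymbol\delta$ and $\tilde{\boldsymbol\delta}$ are Gateaux differentiable at point $\mathbf Q$ in the direction $\mathbf P - \mathbf Q$. As

$$\bar I(\mathbf Q) = \log\Big|\mathbf I + \sum_{l}\tilde\delta_l(\mathbf Q)\mathbf C^{(l)}\Big| + \log\Big|\mathbf I + \mathbf Q\sum_{l}\delta_l(\mathbf Q)\tilde{\mathbf C}^{(l)}\Big| - \sigma^2t\sum_{l}\delta_l(\mathbf Q)\tilde\delta_l(\mathbf Q),$$

it is clear that $\mathbf Q \mapsto \bar I(\mathbf Q)$ is Gateaux differentiable as well at point $\mathbf Q$ in the direction $\mathbf P - \mathbf Q$.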